Journal of Guangdong University of Technology ›› 2020, Vol. 37 ›› Issue (04): 35-41. doi: 10.12052/gdutxb.190140


A Monocular Depth Estimation Combined with Attention and Unsupervised Deep Learning

Cen Shi-jie, He Yuan-lie, Chen Xiao-cong   

  1. School of Computers, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2019-11-18  Online: 2020-07-11  Published: 2020-07-02

Abstract: To address the boundary blurring produced by current unsupervised monocular depth estimation methods, a network architecture based on a dual attention module is proposed. The architecture addresses boundary blurring in the estimated depth by exploiting the long-range context information of image features. The model framework, which comprises a depth estimation network and a pose estimation network, is trained with an unsupervised method based on view synthesis and estimates depth and camera pose transformation simultaneously. The dual attention module, consisting of a position attention module and a channel attention module, is embedded in the depth estimation network. It represents long-range dependencies between spatial locations as well as the context information between different feature maps, so that the network can estimate depth with finer detail. Experimental results on the KITTI and Make3D datasets show that the method effectively improves the accuracy of monocular depth estimation and addresses the boundary blur problem in the estimated depth.
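
The abstract does not give the internal formulation of the two attention modules, so the following is only a minimal NumPy sketch of one common way such modules are built: self-attention over spatial positions and self-attention over channels, each added back to the input through a residual connection. The function names, the projection matrices `w_q`, `w_k`, `w_v`, the scalar weights `gamma` and `beta`, and the element-wise sum used to fuse the two branches are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def position_attention(feat, w_q, w_k, w_v, gamma=1.0):
    """Self-attention over spatial positions (hypothetical formulation).

    feat: (C, H, W) feature map.
    w_q, w_k: (C_r, C) query/key projections; w_v: (C, C) value projection.
    gamma: scalar weight of the attention branch (assumed; learnable in practice).
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)          # (C, N) with N = H * W
    q = w_q @ x                         # (C_r, N)
    k = w_k @ x                         # (C_r, N)
    v = w_v @ x                         # (C, N)
    attn = softmax(q.T @ k, axis=-1)    # (N, N): row i weights every position j
    out = v @ attn.T                    # (C, N): out[:, i] = sum_j attn[i, j] * v[:, j]
    return (gamma * out + x).reshape(c, h, w)


def channel_attention(feat, beta=1.0):
    """Self-attention over channels (hypothetical formulation).

    feat: (C, H, W) feature map.
    beta: scalar weight of the attention branch (assumed; learnable in practice).
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)          # (C, N)
    attn = softmax(x @ x.T, axis=-1)    # (C, C): row i weights every channel j
    out = attn @ x                      # (C, N)
    return (beta * out + x).reshape(c, h, w)


def dual_attention(feat, w_q, w_k, w_v, gamma=1.0, beta=1.0):
    """Fuse both branches by element-wise sum (an assumed fusion rule)."""
    pos = position_attention(feat, w_q, w_k, w_v, gamma)
    chn = channel_attention(feat, beta)
    return pos + chn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 6, 10))   # toy feature map: C=8, H=6, W=10
    w_q = rng.standard_normal((2, 8))
    w_k = rng.standard_normal((2, 8))
    w_v = rng.standard_normal((8, 8))
    print(dual_attention(feat, w_q, w_k, w_v).shape)
```

In this sketch every output position is a weighted sum over all positions, and every output channel is a weighted sum over all channels, which is how the two branches can carry long-range spatial context and inter-feature-map context respectively. Setting `gamma` or `beta` to zero reduces the corresponding branch to the identity.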

Key words: depth estimation, unsupervised learning, deep learning, attention, robotics

CLC Number: 

  • TP249