  • Paper Information
  • Received: Mar. 4, 2020

    Accepted: Apr. 13, 2020

    Posted: Jul. 1, 2020

    Published Online: Jul. 23, 2020

    Author Email: Da Feipeng (1095528214@qq.com)

    DOI: 10.3788/AOS202040.1415001

  • Citation

    Mingyang Cheng, Shaoyan Gai, Feipeng Da. A Stereo-Matching Neural Network Based on Attention Mechanism[J]. Acta Optica Sinica, 2020, 40(14): 1415001

  • Category
  • Machine Vision
Acta Optica Sinica, Vol. 40, Issue 14, 1415001 (2020)

A Stereo-Matching Neural Network Based on Attention Mechanism

Cheng Mingyang1,2, Gai Shaoyan1,2, and Da Feipeng1,2,3,*

Author Affiliations

  • 1School of Automation, Southeast University, Nanjing, Jiangsu 210096, China
  • 2Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, Jiangsu 210096, China
  • 3Shenzhen Research Institute, Southeast University, Shenzhen, Guangdong 518063, China

Abstract

To improve the accuracy of binocular stereo matching in weakly textured scenes, this study proposes a 3D reconstruction algorithm whose feature extraction is based on an attention mechanism. The proposed model uses a convolutional neural network (CNN) to learn feature representations of the left and right images and to compute the stereo matching cost. First, in the CNN feature extraction stage, the outputs of an attention module and a channel attention module are summed to capture the relationships among the pixels of the feature map, which enables the network to capture context information better and to reconstruct weakly textured areas more accurately. Second, a semantic coding loss is integrated into the network. The final loss function is defined as the weighted sum of the semantic coding loss and the reconstruction loss, which effectively improves the reconstruction accuracy in weakly textured regions. The algorithm is validated on the KITTI and Sceneflow datasets. Experimental results show that the proposed method improves accuracy, particularly in weakly textured areas.
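
The weighted-sum loss described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the form of either term or the value of the weight, so everything specific here is an assumption made for illustration: the reconstruction loss is taken to be a smooth L1 distance between predicted and ground-truth disparities, the semantic coding loss is taken to be a binary cross-entropy over per-class presence probabilities, and the names `smooth_l1`, `binary_cross_entropy`, `total_loss`, and the weight `alpha` are hypothetical.

```python
import math


def smooth_l1(pred, target):
    """Mean smooth L1 distance (threshold 1) between two disparity lists.

    Assumed stand-in for the reconstruction loss; the abstract does not
    specify its form.
    """
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / len(pred)


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between probabilities and 0/1 labels.

    Assumed stand-in for the semantic coding loss.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def total_loss(pred_disp, true_disp, class_probs, class_labels, alpha=0.1):
    """Weighted sum of the two terms; alpha is a hypothetical weight."""
    rec = smooth_l1(pred_disp, true_disp)
    sem = binary_cross_entropy(class_probs, class_labels)
    return rec + alpha * sem
```

Setting `alpha=0.0` reduces `total_loss` to the reconstruction term alone, which makes it easy to compare training with and without the semantic term under these assumptions.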
