Research And Application Of Monocular 3D Object Detection Algorithm Based On Deep Learning

Posted on: 2024-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
GTID: 2568307079472364
Subject: Electronic information
Abstract/Summary:
With the rapid development of artificial intelligence technology, urban traffic is gradually becoming modernized and intelligent. To alleviate road congestion and improve traffic efficiency and capacity, vehicle positioning and tracking technology for holographic intersection scenes is needed to obtain environmental information at intersections. This thesis proposes a monocular 3D object detection algorithm, studied from two aspects, feature extraction and fusion and a depth fusion strategy, and applies it to vehicle positioning and tracking in holographic intersection scenes. The main work of this thesis is as follows:
(1) A feature extraction and fusion network based on a window-interaction multi-head self-attention mechanism and a global attention mechanism is proposed. It effectively captures global context information in images while retaining target detail and position information. The network addresses two problems: traditional CNNs struggle to obtain a wide receptive field and global context information because of the limitations of the convolution operation, and target detail information is lost through excessive downsampling.
(2) A depth fusion strategy based on unequal confidence is proposed. It assumes that different depth prediction methods should have unequal confidence in their predicted depth values, and it obtains the final depth value by weighted averaging. The strategy addresses the problem of how to fuse multiple depth estimates optimally while fully accounting for their complementarity and differences.
(3) A smart traffic holographic intersection monitoring system is designed and implemented, in which the monocular 3D object detection algorithm proposed in this thesis is applied. The system receives data from video analysis models and video fusion modules, analyzes and processes it, outputs the results to designated remote servers, and provides decision support for users, thereby providing auxiliary services for smart traffic scenarios.
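The weighted averaging described in item (2) can be illustrated with a minimal sketch. The abstract does not say how the confidences are obtained or normalized, so the sketch assumes that each depth prediction method supplies one non-negative confidence score per object and that the fused depth is the confidence-weighted mean. The function name `fuse_depths` and the sample values are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def fuse_depths(depths: Sequence[float], confidences: Sequence[float]) -> float:
    """Return the confidence-weighted average of several depth estimates.

    Assumption: confidences are non-negative scores, one per estimate.
    They do not need to sum to 1; they are normalized here.
    """
    if len(depths) != len(confidences) or len(depths) == 0:
        raise ValueError("need exactly one confidence per depth estimate")
    if any(c < 0 for c in confidences):
        raise ValueError("confidences must be non-negative")

    total = sum(confidences)
    if total == 0:
        # No estimate is trusted more than another: fall back to the plain mean.
        return sum(depths) / len(depths)

    return sum(d * c for d, c in zip(depths, confidences)) / total


# Hypothetical example: three depth predictions (in metres) for one vehicle,
# with unequal confidences. The fused value is pulled toward the estimate
# with the highest confidence (about 25.2 m here).
depths = [24.0, 26.0, 30.0]
confidences = [0.6, 0.3, 0.1]
fused = fuse_depths(depths, confidences)
```

With equal confidences the function reduces to an ordinary mean, which is the equal-confidence baseline that an unequal-confidence strategy departs from.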
Keywords/Search Tags: Monocular vision, 3D object detection, attention mechanism, contextual information