
Research On Key Technologies Of Real-scene 3D Multi-level Reconstruction Based On Deep Learning

Posted on: 2024-08-09    Degree: Master    Type: Thesis
Country: China    Candidate: T F Wang    Full Text: PDF
GTID: 2552307166467124    Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
3D Real Scene construction in China can be divided into terrain-level, city-level, and component-level according to the content and level of expression. With the comprehensive advancement of 3D Real Scene construction in China, the reconstruction of indoor and outdoor 3D models has gained increasing attention in the fields of photogrammetry and remote sensing. However, the low level of automation and the lack of semantic information hinder the rapid generation and widespread application of 3D models. In recent years, deep learning technology has developed rapidly and plays a significant role in semantic recognition and generation tasks across various types of data. This paper focuses on city-level outdoor 3D reconstruction and component-level indoor 3D reconstruction, aiming to propose an automated modeling framework for multi-level 3D Real Scene guided by the semantic information provided by deep learning methods.

In the field of city-level outdoor 3D reconstruction, traditional manual methods suffer from inefficiency and high resource consumption. In addition, current automated city-level reconstruction methods can only produce an overall mesh and cannot perform automated, structured reconstruction of individual buildings. To address these two issues, this paper uses satellite images and a DSM as data sources and proposes a LOD-1 model reconstruction technique for urban areas based on the Mask R-CNN network. Firstly, Mask R-CNN, a deep learning instance segmentation network, is used to extract building shape information from orthoimages, which serves as semantic guidance. Secondly, the extracted shapes are subjected to contour extraction and regularization. Next, the regularized 2D shape information is combined with the 3D elevation information from the DSM, enabling automated modeling of city-level outdoor buildings. Finally, experiments were conducted using GF-7 imagery and a DSM generated by SVS software. The experimental results demonstrate that the proposed method can achieve LOD-1 modeling of city-level outdoor buildings.

In the field of component-level indoor 3D reconstruction, component-level models are more detailed and complex, requiring higher accuracy of semantic and structural information. Point clouds may be incomplete or overlapping due to occlusion in real scenes, which affects the accuracy of 3D reconstruction, and indoor modeling is difficult to automate and to enrich with semantics. To address these issues, this paper proposes a semantic, automated strategy for component-level 3D reconstruction with LiDAR data as the main data source. The strategy consists of three main parts. (1) Firstly, this paper proposes a point cloud semantic segmentation network called LFCG-Net based on graph encoding. The main idea of the network is to encode point cloud features with fully connected graph structures, providing a more comprehensive description of local point cloud features. The concept of residual connections is then applied to 3D point clouds to enlarge the receptive field and enhance feature learning. The network also uses an inverse-frequency-weighted cross-entropy loss function to alleviate class imbalance in the dataset. Test results on multiple point cloud datasets demonstrate the effectiveness of the semantic segmentation method. (2) Secondly, building on the extracted semantic information, this paper proposes a point cloud completion method based on cross-modal self-supervision, PCSL. PCSL consists of two branches: masked-autoencoder self-supervised learning and cross-modal contrastive learning. The masked-autoencoder branch decodes the encoded features to learn the overall structure of the point cloud and assist completion. The cross-modal contrastive branch projects the encoded features into a high-dimensional space and applies contrastive learning with Siamese networks to obtain fused features, thereby capturing the information shared by similar point clouds and assisting point cloud representation learning. (3) Finally, based on the completed point cloud with semantic information, a component-level indoor 3D reconstruction method is proposed that builds on point cloud semantics and model matching. Firstly, a 3D-ESF indoor model library is constructed, and a candidate-model construction strategy based on semantic segmentation confidence is proposed. Then, based on the correspondence between models and point cloud categories, an automated coarse-to-fine matching and modeling method for indoor scenes is presented. Finally, experiments are conducted on a subset of the ScanNet dataset. The experimental results demonstrate that, guided by the semantic information of point clouds, the proposed automated matching and modeling method can achieve fast and accurate 3D reconstruction of component-level objects.
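The city-level pipeline above (regularized 2D footprints combined with DSM elevations to form flat-roofed prisms) can be illustrated with a minimal sketch. The function name and the choice of the median as the roof-height statistic are illustrative assumptions, not the thesis's exact implementation.

```python
# Minimal sketch of LOD-1 extrusion: a regularized building footprint
# (2D polygon) is combined with DSM samples falling inside it to form
# a flat-roofed prism. The median is used here as a robust height
# statistic; this choice is an assumption for illustration.
from statistics import median

def extrude_lod1(footprint, dsm_samples, ground_elev):
    """Build a LOD-1 prism from a 2D footprint and DSM heights.

    footprint   : list of (x, y) vertices of the regularized outline
    dsm_samples : DSM elevations sampled inside the footprint
    ground_elev : terrain elevation at the building base
    """
    roof_elev = median(dsm_samples)  # robust against outlier pixels
    base = [(x, y, ground_elev) for x, y in footprint]
    roof = [(x, y, roof_elev) for x, y in footprint]
    return {"base": base, "roof": roof, "height": roof_elev - ground_elev}

# Usage with a toy 10 m x 6 m footprint; one DSM sample is an outlier.
block = extrude_lod1(
    footprint=[(0, 0), (10, 0), (10, 6), (0, 6)],
    dsm_samples=[24.8, 25.1, 25.0, 31.0, 24.9],
    ground_elev=10.0,
)
print(block["height"])  # 15.0
```

The median keeps a single noisy DSM pixel (31.0 above) from distorting the extruded height, which is why robust statistics are common in footprint-based LOD-1 modeling.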
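The inverse-frequency weighting used in LFCG-Net's loss, described in part (1), can be sketched as follows. The abstract does not give the exact formula, so the variant shown here (per-class weight proportional to total count over class count) is an assumption for illustration.

```python
# Sketch of an inverse-frequency-weighted cross-entropy: rare classes
# receive larger weights so imbalanced point cloud datasets do not
# bias training toward dominant classes. The exact normalization is
# an illustrative assumption.
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weight proportional to 1 / class frequency."""
    counts = Counter(labels)
    return {c: len(labels) / n for c, n in counts.items()}

def weighted_cross_entropy(probs, label, weights):
    """Weighted CE for one point: -w_label * log p(label)."""
    return -weights[label] * math.log(probs[label])

# Toy imbalanced labels: 'wall' dominates, 'chair' is rare.
labels = ["wall"] * 8 + ["chair"] * 2
w = inverse_frequency_weights(labels)  # wall: 1.25, chair: 5.0

# The same predicted confidence is penalized 4x more on the rare class:
loss_rare = weighted_cross_entropy({"chair": 0.5, "wall": 0.5}, "chair", w)
loss_freq = weighted_cross_entropy({"chair": 0.5, "wall": 0.5}, "wall", w)
print(loss_rare / loss_freq)  # 4.0
```

In practice such weights are passed directly to a framework loss (e.g. the `weight` argument of PyTorch's cross-entropy loss) rather than computed per point.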
Keywords/Search Tags:3D Real Scene, indoor scene reconstruction, point cloud completion, semantic, 3D reconstruction