
Spatial And Semantic Information Joint Extraction And Intelligent Integration From Satellite Stereo Image Pairs

Posted on: 2024-06-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Y Liao
Full Text: PDF
GTID: 1520307292460204
Subject: Photogrammetry and Remote Sensing
Abstract/Summary:
High-spatial-resolution stereo satellites provide Earth-observation imagery from multiple viewpoints, which contains abundant spatial and semantic information. Spatial-information extraction methods reconstruct the geometric morphology of the imaged area through photogrammetric techniques, while semantic-information extraction methods divide the images pixel-wise into ground-object categories using semantic segmentation algorithms. Research on information extraction from satellite stereo image pairs is essential for remote sensing applications such as digital twins, urban planning, land-use monitoring, and disaster assessment, and thus holds significant theoretical and practical value. However, previous research treated the extraction of spatial and semantic information as two separate tasks, with insufficient exploration of the complementarity and relevance between the two types of information. There is currently no method that integrates and models the resulting multi-modal information, or that analyzes and presents the observed scene from multiple perspectives with high accuracy. To investigate the relationship between spatial and semantic information, this paper introduces deep learning theory and technology and proposes a method for the joint extraction and intelligent integration of spatial and semantic information from satellite stereo image pairs, according to the properties of the projection model in satellite imagery. The paper covers image matching algorithms, multi-task learning algorithms for stereo matching and semantic segmentation, terrain reconstruction methods, and multi-modal information integration methods. Experiments are conducted on satellite imagery including Sentinel-2, Zhuhai-1, WorldView-2/3, Gaofen-7, and SuperView-1 to verify the effectiveness of the proposed method. Furthermore, a spatial-semantic integrated model was established on the US3D dataset to demonstrate the necessity of combining spatial and semantic tasks. The research presented in this paper includes the following:

Firstly, to address the low positional accuracy and uneven distribution of tie points generated by previous image matching algorithms, this paper proposes a deep learning framework called DRRD based on the "describe-and-detect" approach. The method computes the distinguishability of each pixel for keypoint detection, which prevents keypoints from blending into neighboring pixels. It includes an image representation network with an unchanged receptive field, and employs an adaptive mix-content triplet loss function to train the model on multi-spectral imagery datasets in a self-supervised manner. Experiments on multi-temporal images demonstrate that the framework generates tie points with high positional accuracy and even distribution, supporting the rational polynomial parameter adjustment model and enabling pose estimation of stereo image pairs.

Secondly, to address the unstable performance of stereo matching and semantic segmentation on datasets unseen during training, this paper establishes a multi-task learning network. The network extracts spatial information from monocular images and semantic information from binocular images, fuses them with learnable weights through a spatial self-attention mechanism, and generates disparity maps and classification maps simultaneously. To further improve the generalizability of the deep model, the paper exploits the continuity of disparity values and the duality of forward-backward view images, and designs data augmentation methods based on stereo image properties to increase the diversity of training data. Comparative experiments on satellite, aerial, and street-view imagery demonstrate that the proposed multi-task learning method significantly improves the robustness and accuracy of deep models and provides pixel-level image understanding for stereo image pairs.

Thirdly, in photogrammetry pipelines the accuracy of the spatial information extracted at each pixel is determined by epipolar rectification and disparity estimation, both of which are affected by the range of disparity values in the epipolar image pairs. Ortho-rectifying forward-backward view stereo image pairs with an accurate DSM yields epipolar image pairs with small disparity values. This paper therefore utilizes a pyramid pipeline that iteratively refines DSMs, which reduces the disparity range and improves epipolar rectification and stereo matching. On this basis, the paper models the collinear relationship among object points, image pixels, and imaging sensors with the rational function model, projects the extracted two-dimensional image information into three-dimensional geographic space, and achieves scene reconstruction from satellite image pairs.

Fourthly, the integration of multi-modal information in remote sensing applications provides more extensive knowledge than single-modal information and depicts terrain morphology and object distribution from multiple perspectives. This paper proposes a method for the joint extraction of multi-modal information, which generates accurate three-dimensional reconstruction results and image interpretation results from high-spatial-resolution satellite stereo image pairs. Moreover, the method intelligently integrates spatial and semantic information, mitigating model errors caused by misclassification and mismatching, and thereby develops a spatial-semantic integrated model. The method expands the applications of stereo satellite imagery and serves a wide range of scientific research and engineering practice.
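The triplet-loss training mentioned for the DRRD framework can be illustrated with a minimal sketch. This is not the dissertation's actual loss (the adaptive mix-content variant is not specified here); it is a generic batch-hard triplet margin loss over matched descriptor pairs, with hardest-negative mining inside the batch, and all names are illustrative.

```python
import numpy as np

def triplet_descriptor_loss(anchors, positives, margin=1.0):
    """Batch-hard triplet loss for local descriptors (illustrative sketch).

    anchors, positives: (N, D) descriptor arrays; the i-th positive matches
    the i-th anchor, and every other positive serves as a candidate negative.
    """
    # Pairwise Euclidean distances between anchors and all positives.
    dists = np.linalg.norm(anchors[:, None, :] - positives[None, :, :], axis=2)
    pos_d = np.diag(dists)                      # distances of matching pairs
    # Mask out the matching pair, then take the hardest (closest) negative.
    masked = dists + np.eye(len(anchors)) * 1e6
    neg_d = masked.min(axis=1)
    # Hinge: push matching pairs closer than the hardest negative by `margin`.
    return np.maximum(0.0, margin + pos_d - neg_d).mean()
```

A loss of zero means every matching pair is already closer than its hardest in-batch negative by at least the margin, which is the training signal driving the descriptors apart.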
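The learnable weighting by spatial self-attention described for the multi-task network can be sketched as a per-pixel gated fusion of the two task branches. The dissertation's actual layer structure is not given here; this sketch assumes a single learnable gate vector `w_att` producing a convex combination of the spatial and semantic feature maps, with all names hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(spatial_feat, semantic_feat, w_att):
    """Fuse two feature maps with a learnable per-pixel gate (sketch).

    spatial_feat, semantic_feat: (H, W, C) feature maps from the two task
    branches. w_att: (2*C,) learnable vector; illustrative only.
    """
    stacked = np.concatenate([spatial_feat, semantic_feat], axis=-1)  # (H, W, 2C)
    gate = sigmoid(stacked @ w_att)[..., None]                        # (H, W, 1)
    # Per-pixel convex combination of the two branches.
    return gate * spatial_feat + (1.0 - gate) * semantic_feat
```

With an untrained (zero) gate vector the fusion falls back to a plain average; training moves the gate toward whichever branch is more informative at each pixel.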
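The collinearity modeling via the rational function model amounts to mapping normalized ground coordinates to image coordinates through ratios of polynomials. Operational satellite RPCs use four 20-term cubic polynomials; the sketch below truncates each to a first-order basis purely to show the normalize, ratio-of-polynomials, denormalize flow, and every name and coefficient layout is an assumption for illustration.

```python
import numpy as np

def rfm_project(lat, lon, h, coeffs, offsets, scales):
    """Simplified rational-function-model ground-to-image projection (sketch).

    coeffs: dict with 'line_num', 'line_den', 'samp_num', 'samp_den', each a
    length-4 array over the truncated basis [1, lon, lat, h].
    offsets/scales: dicts normalizing lat, lon, h, line, samp.
    """
    # Normalize ground coordinates to roughly [-1, 1].
    P = (lat - offsets['lat']) / scales['lat']
    L = (lon - offsets['lon']) / scales['lon']
    H = (h - offsets['h']) / scales['h']
    basis = np.array([1.0, L, P, H])
    # Ratio of polynomials gives normalized image coordinates.
    line_n = (coeffs['line_num'] @ basis) / (coeffs['line_den'] @ basis)
    samp_n = (coeffs['samp_num'] @ basis) / (coeffs['samp_den'] @ basis)
    # Denormalize to pixel (line, sample) coordinates.
    return line_n * scales['line'] + offsets['line'], \
           samp_n * scales['samp'] + offsets['samp']
```

Inverting this mapping per pixel, with a disparity-derived height, is what projects the two-dimensional image information into three-dimensional geographic space.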
Keywords/Search Tags: Satellite Stereo Image Pair, Image Matching, Stereo Matching, Semantic Segmentation, Scene Reconstruction