
Research On Image Matching Method For Challenging Scenarios And Its Applications

Posted on: 2024-04-22  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Z W Shen  Full Text: PDF
GTID: 1528307208457874  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the flourishing development of artificial intelligence technology, visual perception, as one of the most prominent perceptual technologies, is driving continuous breakthroughs and innovations in computer vision. Visual tasks such as image stitching, object tracking, 3D reconstruction, and structure from motion all rely on image matching. Image matching serves as a bridge between low-level and high-level vision, establishing spatial correspondences between images or objects. However, it faces significant challenges in scenes with weak textures, repetitive structures, and lighting variations, so research on image matching methods for weak-texture scenes is of great significance. This dissertation focuses on image matching methods and their applications in weak-texture scenes. The main work and innovations are as follows:

(1) An attention-based image matching method, TRLWAM (Transformer with Linear-Window Attention for Feature Matching), is proposed. A visual Transformer attention model, Linear-Window Attention, is designed: attention computation is constrained to non-overlapping local windows, and attention within each window is computed as a linear dot product of kernel feature maps. Applying this model to image matching yields a Transformer-based matcher with a linear-window attention mechanism that produces dense matches. Experiments show that, compared with several state-of-the-art methods, the approach significantly reduces execution time and memory consumption while maintaining competitive or superior accuracy, and that the linear-window attention model improves matching efficiency and extracts dense, accurate matches in weak-texture scenes.

(2) A real-time image stitching method based on TRLWAM is proposed. Knowledge distillation is applied to TRLWAM to obtain a lightweight matching model suitable for real-time stitching. The Transformer's self-attention captures effective global information; the K-means algorithm is used to select and filter feature points; a homography matrix is estimated from the matches to perform the pixel transformation; and the final result is obtained by Gaussian fusion of the warped image layers. Experiments show that, compared with traditional methods and other deep learning approaches, the method produces more matches in weak-texture scenes, runs in real time with the shortest stitching time among the compared methods, and yields good stitching results.
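To make the linear-window attention of (1) concrete, the sketch below is a minimal, hypothetical PyTorch approximation rather than the dissertation's implementation: the feature map is assumed to be organized as (B, H, W, C), the kernel feature map is taken to be elu(x)+1 (a common choice in linear attention), and the names window_partition and linear_window_attention are illustrative.

    # Hypothetical sketch: linear attention restricted to non-overlapping windows (PyTorch assumed).
    import torch
    import torch.nn.functional as F

    def window_partition(x, win):
        # x: (B, H, W, C) -> (B * num_windows, win * win, C)
        B, H, W, C = x.shape
        x = x.view(B, H // win, win, W // win, win, C)
        return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, win * win, C)

    def linear_window_attention(q, k, v, eps=1e-6):
        # q, k, v: (num_windows, N, C); kernel feature map phi(x) = elu(x) + 1
        q, k = F.elu(q) + 1, F.elu(k) + 1
        kv = torch.einsum("bnc,bnd->bcd", k, v)                        # sum_n phi(k_n) v_n^T
        z = 1.0 / (torch.einsum("bnc,bc->bn", q, k.sum(dim=1)) + eps)  # per-query normalizer
        return torch.einsum("bnc,bcd,bn->bnd", q, kv, z)               # linear in window size

Because attention never leaves its window and the softmax is replaced by a kernel dot product, cost and memory grow linearly with the number of tokens, which is consistent with the efficiency claims above.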
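For the stitching step of (2), the geometric part can be sketched as follows, assuming OpenCV and NumPy and that the distilled matcher has already produced matched point arrays; the Gaussian fusion of image layers is approximated here by a simple average over the overlap region, so this illustrates the pipeline shape rather than the method itself, and all names are illustrative.

    # Hypothetical sketch: homography estimation and blending from matched points (OpenCV/NumPy assumed).
    import cv2
    import numpy as np

    def stitch_pair(img_src, img_dst, pts_src, pts_dst):
        # Robustly estimate the homography mapping the source image onto the destination image.
        H, inliers = cv2.findHomography(pts_src, pts_dst, cv2.RANSAC, 5.0)
        h, w = img_dst.shape[:2]
        warped = cv2.warpPerspective(img_src, H, (2 * w, h)).astype(np.float32)  # simple 2x-wide canvas
        canvas = np.zeros_like(warped)
        canvas[:h, :w] = img_dst
        # Crude stand-in for Gaussian fusion: average wherever both images have content.
        overlap = (warped.sum(axis=2) > 0) & (canvas.sum(axis=2) > 0)
        blended = np.where(overlap[..., None], (warped + canvas) / 2.0, warped + canvas)
        return np.clip(blended, 0, 255).astype(np.uint8)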
(3) A fast image matching method based on multilayer perceptrons (MAIM) is proposed. A hybrid multilayer perceptron structure, Mixer-WMLP, is designed: the feature map is evenly divided into non-overlapping windows that are unfolded into tokens, enabling information exchange between spatial positions. On top of this network, an image matching method is constructed in which a two-layer mixed MLP structure performs channel information exchange in the coarse-level module; the fused features are then passed to the fine-level module for dense fine-level matching to produce the final matches. The hybrid MLP model also provides a global receptive field. Experiments show that this method outperforms Transformer-based matchers in memory consumption and matching time, and that it has advantages in matching performance over traditional and CNN-based methods.

(4) A visual odometry method based on MAIM is proposed. A visual odometry pipeline built on the hybrid MLP matching model is designed; after the matched point pairs are obtained, the camera's relative pose is estimated by minimizing the reprojection error between keypoints. Experiments on multiple public dataset sequences and in real-world environments show that accurate pose estimation is achieved even in weak-texture scenes, and that the approach attains higher relative pose accuracy than current mainstream monocular visual odometry methods.
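To illustrate the window-based MLP mixing behind Mixer-WMLP in (3), the block below is a hypothetical PyTorch approximation in the spirit of MLP-Mixer, not the dissertation's code: each window has already been unfolded into tokens, a token-mixing MLP exchanges information across positions inside the window, and a channel-mixing MLP exchanges information across feature channels.

    # Hypothetical sketch: MLP-Mixer-style block over non-overlapping windows (PyTorch assumed).
    import torch.nn as nn

    class WindowMLPBlock(nn.Module):
        def __init__(self, dim, win=8, hidden=256):
            super().__init__()
            n_tokens = win * win
            self.norm1 = nn.LayerNorm(dim)
            self.token_mlp = nn.Sequential(nn.Linear(n_tokens, hidden), nn.GELU(), nn.Linear(hidden, n_tokens))
            self.norm2 = nn.LayerNorm(dim)
            self.channel_mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

        def forward(self, x):
            # x: (num_windows, win * win, dim), i.e. each window already unfolded into tokens
            y = self.norm1(x).transpose(1, 2)           # (windows, dim, tokens)
            x = x + self.token_mlp(y).transpose(1, 2)   # mix across positions within the window
            x = x + self.channel_mlp(self.norm2(x))     # mix across channels
            return x

Stacking such blocks across coarse and fine levels is one way an MLP-only model can approach a global receptive field without attention; the exact structure used in MAIM may differ.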
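Finally, for the visual odometry of (4): the dissertation recovers the relative pose by minimizing the reprojection error between keypoints. As a simplified stand-in, the sketch below estimates the relative rotation and translation from the matched points via the essential matrix with OpenCV (pts_prev, pts_curr, and the intrinsic matrix K are assumed inputs); a reprojection-error refinement such as bundle adjustment would normally follow this step.

    # Hypothetical sketch: two-view relative pose from matched points (OpenCV assumed).
    import cv2

    def relative_pose(pts_prev, pts_curr, K):
        # pts_prev, pts_curr: (N, 2) matched pixel coordinates from the matcher; K: (3, 3) intrinsics.
        E, mask = cv2.findEssentialMat(pts_prev, pts_curr, K, method=cv2.RANSAC, prob=0.999, threshold=1.0)
        # Decompose E and keep the (R, t) hypothesis that passes the cheirality check.
        _, R, t, _ = cv2.recoverPose(E, pts_prev, pts_curr, K, mask=mask)
        return R, t  # rotation matrix and unit-norm translation (monocular scale is unobservable)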
Keywords/Search Tags:Image matching, Attention mechanism, Knowledge distillation, Global receptive field, Multi-layer perceptron