Research On Real-time 3D Object Tracking And Motion Analysis Based On Feature Fusion

Posted on: 2022-04-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J C Li
Full Text: PDF
GTID: 1488306608477284
Subject: Physics
Abstract/Summary:
Research on computer vision is commonly divided into low-level, middle-level, and high-level vision. Low-level vision mainly refers to pixel-level image processing, such as edge detection, filtering, and multi-view geometry; middle-level vision mainly refers to feature-level image analysis, such as contour extraction, segmentation, and tracking; high-level vision refers to image understanding, such as pattern recognition and 3D reconstruction. 3D object tracking uses low-level visual features of the image to estimate the pose of an object relative to the camera across consecutive video frames, and is an important middle-level vision task. Visual 3D object tracking estimates the object's pose in space directly, so no artificial markers need to be added and the original scene is preserved, which greatly improves the naturalness of interaction. With the rapid iteration of mobile devices and the development of portable wearable devices, 3D object tracking is expected to find broader application in augmented reality, intelligent manufacturing, healthcare, autonomous driving, robotics, and entertainment.

Once 3D tracking has produced low- and middle-level information such as the object pose, the object's motion can be analyzed further for high-level semantics. Accurately estimating the object's pose, and lifting low- and middle-level information to high-level semantic information, both remain challenging. This dissertation focuses on high-precision, hard real-time 3D tracking of texture-less objects and on the semantic analysis of head movement. Its main contributions are as follows:

1. A multi-feature fusion method for monocular texture-less 3D object tracking based on a bundle structure, aimed at improving tracking accuracy. Single-feature methods have inherent shortcomings, and existing multi-feature fusion methods merely combine several features directly, so they cannot fully exploit each feature. This dissertation proposes an adaptively weighted local bundle structure and, based on it, defines a multi-feature fusion energy function for complex scenes. The image being optimized is divided into several sub-regions through the bundle structure, and edge and color features are fused to handle the spatial inconsistency between different features and to make full use of each of them. Experimental results show that the method improves overall tracking accuracy in complex scenes.

2. A high-speed method for monocular texture-less 3D object tracking based on geometric contours and local regions, aimed at improving tracking speed. Reprojection is a critical step in 3D object tracking and is also the main bottleneck limiting tracking speed. The method uses the geometric properties of the object to extract its geometric contour and quickly estimate a coarse pose, then uses local regions to fine-tune the pose. This strategy reduces the number of reprojections during tracking to one, which greatly improves computational efficiency while maintaining tracking accuracy.

3. A precise texture-less 3D object tracking method based on multi-view RGB information, which achieves sub-millimeter tracking precision. To compensate for the lack of information along the camera viewing direction in monocular 3D object tracking, this dissertation proposes an optimization framework based on an object-centered model to obtain a more precise pose. The framework establishes constraints between multiple cameras and the texture-less object. Its unified energy function eliminates pose uncertainty through multi-view information and unifies the image features and the coordinate system. Experimental results show that the method greatly improves translation precision along the camera viewing direction, and its high-precision results can be used for dataset construction.

4. Two head motion analysis methods based on hierarchical models, which analyze the high-level semantics of head pose motion by fusing multiple features. To address problems such as the difficulty of data labeling and poor model generality, this dissertation proposes two hierarchical models that proceed from low-level semantic extraction to high-level semantic understanding. Through the constructed hierarchy, the models can perform semantic analysis of short-term video while requiring labels only for long-term video. Experimental results show that both models perform effective high-level semantic analysis on videos related to head pose and can be applied to cervical spine health assessment and attention analysis.
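
The motivation behind the multi-view method (contribution 3) can be illustrated with a minimal sketch. It is not the dissertation's implementation: it only shows, with a plain pinhole model, why motion along one camera's viewing direction is hard to observe from that camera and easy to observe from a second, perpendicular camera. The focal length, camera placement, point coordinates, and the helper names `project` and `pixel_shift` are all assumptions chosen for illustration.

```python
import numpy as np

FOCAL_PX = 800.0  # assumed focal length in pixels (illustrative value)


def project(R, c, p_world, focal=FOCAL_PX):
    """Pinhole projection of a world point into a camera with rotation R
    (world -> camera) and centre c; returns offsets from the principal point."""
    x, y, z = R @ (p_world - c)
    return focal * np.array([x / z, y / z])


def pixel_shift(R, c, p_world, delta):
    """Image-space displacement caused by moving the point by delta."""
    before = project(R, c, p_world)
    after = project(R, c, p_world + delta)
    return float(np.linalg.norm(after - before))


# Camera A sits at the origin and looks along world +Z.
R_a, c_a = np.eye(3), np.zeros(3)

# Camera B sits to the side and looks along world -X, perpendicular to camera A.
R_b = np.array([[0.0, 0.0, 1.0],
                [0.0, 1.0, 0.0],
                [-1.0, 0.0, 0.0]])
c_b = np.array([0.5, 0.0, 0.5])

p = np.array([0.02, 0.01, 0.5])      # hypothetical object point, in metres
delta = np.array([0.0, 0.0, 0.001])  # 1 mm along camera A's viewing direction

shift_a = pixel_shift(R_a, c_a, p, delta)
shift_b = pixel_shift(R_b, c_b, p, delta)
print(f"camera A: {shift_a:.3f} px, camera B: {shift_b:.3f} px")
```

In this assumed configuration the 1 mm displacement moves the projection by less than a tenth of a pixel in camera A but by more than one pixel in camera B, which is the kind of complementary constraint a unified multi-view energy function can exploit.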
Keywords/Search Tags: 3D Object Tracking, Feature Fusion, Mobile Phone Tracking, Multi-view Tracking, Head Motion Analysis and Understanding