Research On Adaptive Method For Dynamic Target Positioning And Tracking On Monocular Mobile Platform

Posted on: 2022-12-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Wang
GTID: 1488306764458764
Subject: Navigation, guidance and control
Abstract:
With the rapid development of autonomous vehicles and UAVs, target recognition, positioning, and tracking on moving vision platforms have become a popular research field in robotics and artificial intelligence. Theoretical research on target positioning and tracking in 2D images is relatively mature, but 3D positioning and tracking of targets on low-cost moving vision platforms still faces many difficulties: low detection accuracy caused by a lack of training samples, the heavy dependence of neural network models on the platform's computing performance, inaccurate estimation of target depth by monocular vision, and the scale uncertainty and scale drift of monocular visual odometry. These problems severely restrict the application of moving platforms, including ground robots and UAVs, to target spatial positioning, intelligent obstacle avoidance, and autonomous positioning and navigation.

In this dissertation, systematic research was carried out on four key issues: image augmentation methods for small sample sets in target tracking, training methods for multi-scale detection network models with depth parameters, spatial positioning algorithms under geometric constraints, and scale disambiguation in monocular visual odometry. In addition, a large number of indoor and outdoor simulations and UAV flight experiments were carried out on moving vision platforms (handheld cameras, mini rotor UAVs, etc.) to demonstrate the effectiveness of this work. Finally, an adaptive target positioning and tracking framework based on a monocular vision platform was proposed to address several essential problems: training the detection network model with limited samples, constructing a high-quality sample set of moving targets, optimizing the multi-scale detection model, fast target depth estimation and spatial positioning under monocular vision, and recovering the uncertain scale in monocular visual odometry. The main innovations and contributions of this dissertation are as follows:

1) A small sample set augmentation model for tracking targets from a moving platform (Limited Sample Sets Augmentation for Object Tracking, LSSA-OT) was proposed to deal with platform jitter and the scarcity of initial training samples. By enlarging the sample set and increasing the dimensionality of the sample features, it addresses the weak generalization that arises when scarce samples cannot cover the imaging features of a moving target. First, the model augments the geometric features of the samples through similarity, affine, and projective transformations, expanding the dimensions of the target's geometric features. Then, a new random background filling method was proposed to compensate for the image edge information lost after geometric transformation, while improving image quality in the negative sample area to avoid an imbalance in the scale of positive and negative sample feature information. Furthermore, in response to image blur caused by platform jitter and rapid target movement, a multi-directional stacking blur augmentation algorithm was proposed to improve the robustness of the sample set to blurred images. Finally, a traditional augmentation method was adopted to obtain adjustable augmentation probability parameters for each augmentation module, completing the image augmentation model for small sample sets in the moving platform scenario. In the experiments, this method improved detection accuracy by at least 8% when only a small number of samples were available for training, the moving target rotated in space, and the target was blurred and jittering. In particular, for targets with planar features, the affine and projective transformations adopted in this dissertation can increase detection accuracy by more than 12%.

2) A single-stage multi-scale neural network detection model (Fast Depth-Assisted Single Shot Multi Box Detector, FDA-SSD), based on auxiliary training with the spatial depth parameters of the samples, was proposed to address the large number of running parameters and poor real-time performance of neural network detection models. It pre-learns the spatial depth information of a target from an image training sample set with motion parameters, so that the model can adaptively match an optimal detector, optimizing its running parameters and detection speed. First, a method for building image sample sets with motion parameters for target tracking was proposed; it matches the sample images, the sample target motion parameters, and the camera motion parameters, yielding a high-quality training sample set. Second, the sample set was fused with the original SSD model through training via a feedforward neural network, so that the FDA-SSD model learned the matching relationship between the depth parameters and the multi-scale feature classifiers. In addition, by fusing spatial scale parameters, 2D image sample training was upgraded to 3D spatial sample training, so that the model automatically matches the optimal detector, improving the efficiency of target recognition. In the experiments, on a GTX 1060 computing platform, the FDA-SSD model ran at 28.7 fps, about 16.5% faster than the traditional SSD. The root mean square error of target tracking was less than 4.72 cm in the tracking experiment based on FDA and the geometric constraint positioning method.

3) A single-frame parallel-features positioning method (SPPM) was proposed to deal with the slow estimation of depth information and the difficulty of spatially positioning targets in tracking scenarios. It achieves rapid depth estimation and accurate spatial positioning by extracting the geometric constraints of the target, building a model to solve for depth, and using the scale information and geometric features of the tracked target. First, the edge of the target was segmented and key feature points were extracted to obtain the spatial constraints among the target's feature points. Second, the target scale, geometric constraints, and projection equation obtained in the initialization stage of tracking were used to construct a model for solving the target depth. Finally, because the depth equations are nonlinear, high-order, and overdetermined, which makes analytical solutions difficult to obtain, a Runge-Kutta numerical iterative method with fifth-order convergence was introduced to solve for the target depth and spatial position quickly. In the experiments, SPPM showed good robustness in indoor positioning. The mean square error percentage of static indoor target positioning was less than 1.04%, and the mean square error of dynamic target tracking by UAVs was less than 7.26 cm.

4) A monocular visual odometry method with a multi-feature scale stabilizer (Multilevel Scale Stabilizer for Visual Odometry, MLSS-VO) was proposed to deal with scale uncertainty and scale drift in monocular visual odometry. It integrates the target scale parameters from the tracking problem with the pose data of traditional monocular visual odometry and uses the transfer and update of multi-layer feature scales. First, the different features in monocular images were classified, and feature baselines and baseline acquisition rules were defined at every level, abstracting and classifying the scale information in the images. Second, the feature baseline information extracted from each layer was used to build a multi-scale stabilizer that transfers and updates the multi-level feature baseline scale information. Finally, a back-end optimization method from traditional visual odometry was used to deal with scale reference ambiguity and the accumulation of pose calculation errors in the autonomous positioning of a monocular moving platform. In the experiments, the indoor autonomous positioning error of MLSS-VO was less than 3.87 cm, and the target tracking and positioning method based on MLSS-VO had an error of less than 5.97 cm.

Building on the four innovations above, this dissertation proposes a technical framework for adaptive target positioning and tracking based on a monocular vision platform. The framework can be widely used in frontier fields such as target recognition, target tracking, and autonomous positioning and navigation for moving platforms with vision sensors, including ground robots, autonomous vehicles, and rotor UAVs.
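
The role that a known target scale plays in monocular depth estimation and in removing the scale ambiguity of visual odometry can be illustrated with a minimal sketch. This is not the SPPM or MLSS-VO algorithm described in the abstract; it assumes an ideal pinhole camera without lens distortion and a fronto-parallel target of known physical width. All names (PinholeCamera, depth_from_known_width, back_project, rescale_translation) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinholeCamera:
    """Ideal pinhole intrinsics in pixels (assumption: no lens distortion)."""
    fx: float
    fy: float
    cx: float
    cy: float


def depth_from_known_width(cam, real_width_m, pixel_width):
    """Depth Z in metres of a fronto-parallel target: Z = fx * W / w."""
    if pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    return cam.fx * real_width_m / pixel_width


def back_project(cam, u, v, depth_m):
    """Camera-frame 3D point (X, Y, Z) for pixel (u, v) at the given depth."""
    x = (u - cam.cx) * depth_m / cam.fx
    y = (v - cam.cy) * depth_m / cam.fy
    return (x, y, depth_m)


def rescale_translation(translation, known_length_m, estimated_length):
    """Convert an up-to-scale monocular translation to metric units.

    The scale factor is the ratio between a baseline of known metric length
    and the same baseline measured in the odometry's arbitrary units.
    """
    if estimated_length <= 0:
        raise ValueError("estimated_length must be positive")
    s = known_length_m / estimated_length
    return tuple(s * t for t in translation)


if __name__ == "__main__":
    # Hypothetical intrinsics and measurements.
    cam = PinholeCamera(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
    z = depth_from_known_width(cam, real_width_m=0.5, pixel_width=100.0)
    print("depth (m):", z)
    print("target position (m):", back_project(cam, 420.0, 240.0, z))
    print("metric translation (m):",
          rescale_translation((1.0, 0.0, 2.0), 0.5, 0.25))
```

The sketch uses a single known width and a single baseline; the abstract describes richer geometric constraints solved iteratively and a multi-level scale stabilizer, which this example does not attempt to reproduce.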
Keywords: Monocular Target Localization, Image Augmentation, Deep Learning, Localization based on Geometric Constraints, Visual Odometry