
Research on Key Technologies of Video Multiple Object Tracking Based on Data Association

Posted on: 2021-01-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P X Liu
Full Text: PDF
GTID: 1368330626455636
Subject: Communication and Information System
Abstract/Summary:
As an important task in computer vision, multiple object tracking is widely used in applications and research fields including video surveillance, traffic control, autonomous driving, and human-computer interaction. With the rapid development of object detection technology, the approach of first locating objects with a detector and then using data association to generate object trajectories has become the mainstream of multiple object tracking. This approach is called data association-based multiple object tracking. In practice, however, factors such as object motion patterns, the surrounding environment, and video imaging conditions are often very complex, so many difficult problems in multiple object tracking remain unresolved. This dissertation conducts in-depth research on several key problems in data association-based multiple object tracking:

(1) Objects appear and disappear randomly during association, so the number of objects changes continuously and dynamically, which complicates the formulation and solution of the data association optimization problem.
(2) Highly discriminative features must be extracted to tell apart the objects in the tracking environment, and noise in the detection data must be suppressed.
(3) The motions of multiple pedestrians strongly influence one another, and their complex motion patterns must be estimated accurately.
(4) In dense scenes, frequent occlusion heavily fragments object trajectories, which raises the problem of how to enhance information and associate trajectories effectively in this difficult situation.

The main research content of this dissertation is as follows:

(1) A network flow data association algorithm is proposed to handle the uncertainty in the number of objects. A network flow data association model is built on trajectory segments, which transforms the data association problem into finding the minimum cost flow in the network. An iterative shortest-path algorithm is used to solve for each object trajectory. Within the algorithm, a forward-backward search method based on appearance features is designed. It determines the appearance and disappearance of objects in scenes captured by both static and moving cameras, so that the network structure and the corresponding cost parameters can be adjusted flexibly. To prevent incorrect cross-association, a time cost constraint across nodes is introduced.

(2) To distinguish objects in the tracking scene, a feature extraction network based on a Siamese stacked auto-encoder is designed. Trajectory segments are associated using these highly discriminative features, generating a complete trajectory for each object. The Siamese structure and its contrastive loss function enable the network to minimize the feature distance between positive samples and increase the feature distance between negative samples. To better distinguish the objects in the scene, the training samples are collected from the detection data at the current moment. Sample generation applies random resampling to the detection position and size, which both enlarges the sample set and suppresses detection noise. The network adopts a relatively simple two-layer structure so that it can be trained effectively on small sample sets, and an auto-encoder constraint is introduced during training to avoid overfitting. The tracking system extracts highly discriminative object features through online incremental learning and iteratively generates more reliable trajectory segments. Compound features that integrate appearance and motion are also extracted from the trajectory segments to strengthen their data association capability.

(3) A long short-term memory (LSTM) network model based on visual field information and a conditional random field data association model are proposed to solve the problem of motion estimation for close trajectories of multiple pedestrians. In the designed LSTM network, pedestrian visual field information is used to select pairs of highly associated motion trajectories, on which joint motion estimation is performed. The conditional random field data association model takes trajectory segment pairs that can be associated in time as nodes, and uses the visual field LSTM model to constrain which edges are established. The association probability of the trajectory segments in each node is converted into a unary energy, and the association probability of the trajectory segments in each pair of nodes connected by an edge is converted into a pairwise energy, which realizes joint data association of close-range, highly associated trajectory segment pairs. Data association of trajectory segments thus becomes an energy minimization problem, whose solution yields the complete trajectory of each object.

(4) Dense crowds are among the most difficult environments for multiple object tracking of pedestrians, and the task remains very challenging even when multiple cameras are available. This study proposes a Markov random field model based on cross-view coupled trajectory segments. The model includes a new potential function enhancement method that can effectively associate the coupled trajectory segments produced by dense pedestrians. The cross-view coupled trajectory segments are obtained by a data fusion method based on image mutual information, which computes the spatial position relationship between two-dimensional trajectory segments from different views by integrating position and motion information. In addition, human key-point detection is used to correct the position data of incomplete and deviated objects in a dense crowd. The potential function enhancement method for dense pedestrian scenes includes two measures. The first is assimilation and its expansion: soft connections to longer trajectory segments enhance the information of fine segments, and this information is then shared more widely, improving the potential functions of the related nodes. The second is a message-selective belief propagation algorithm, whose message selection rules prevent unreliable messages from fine segments from propagating in the Markov network. With these two measures, the potential functions of the Markov random field model are improved and enhanced over iterations and the trajectory segments are associated effectively, so that dense crowds can be tracked robustly.

The methods proposed in this dissertation have been tested on public experimental data, and the results show that they effectively address the key technical problems above. They are therefore effective methods for data association-based multiple object tracking.
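
As a rough illustration of the Siamese objective described in item (2), the sketch below shows one common form of a pairwise loss that pulls features of the same object together and pushes features of different objects apart. The abstract does not give the dissertation's exact formula, so the margin-based form, the function names `euclidean` and `pair_loss`, and the `margin` parameter are all assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_loss(f1, f2, same_object, margin=1.0):
    """Margin-based pairwise loss (assumed form, not taken from the dissertation).

    Positive pairs (same object) are penalized by their squared distance,
    so minimizing the loss pulls their features together.
    Negative pairs are penalized only while they are closer than `margin`,
    so minimizing the loss pushes their features apart.
    """
    d = euclidean(f1, f2)
    if same_object:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical 2-D features for two detections.
feat_a = [0.0, 0.0]
feat_b = [3.0, 4.0]  # distance 5 from feat_a

loss_if_same = pair_loss(feat_a, feat_b, same_object=True)
loss_if_different = pair_loss(feat_a, feat_b, same_object=False, margin=1.0)
```

With these hypothetical features, the pair is costly if labeled as the same object and costs nothing if labeled as different objects, because the distance already exceeds the margin.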
Keywords/Search Tags: multiple object tracking, data association, Siamese neural network, long short-term memory network, Markov random field