
Research On Visual Tracking And Depth Estimation Algorithms For Remote Environment Target Perception

Posted on: 2021-03-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C G Guo
Full Text: PDF
GTID: 1368330647460723
Subject: Computer application technology
Abstract/Summary:
Natural disasters and accidents occur frequently in China, and protecting the personal safety of rescuers has made remote unmanned operation platforms a new requirement for emergency disaster relief. To give operators accurate awareness of the remote environment, computer vision methods are used to extract information such as the orientation, posture, and depth of targets in the remote scene, which helps avoid potential dangers and supports precise operation guidance. To meet the need for target depth perception in remote scenes, this thesis studies computer vision algorithms in two directions: visual target tracking and binocular stereo depth estimation. It presents theoretical results in both directions and designs an algorithm implementation scheme.

In recent years, visual tracking methods based on traditional mathematical models and stereo matching methods based on deep neural networks have received extensive and sustained attention from researchers. On the one hand, time-varying target morphology and appearance in complex environments challenge existing tracking algorithms. How to mine locally stable appearance features of the target and achieve fast, robust visual tracking still requires in-depth research. On the other hand, the latest stereo matching algorithms reduce matching error by designing increasingly complex deep neural network structures, but they still cannot output high-resolution depth maps in real time. Improving the inference efficiency of stereo matching by simplifying the network structure, while still accurately predicting fine structures such as edges in the depth map, remains difficult in practical applications.

This thesis thoroughly reviews the mathematical models and algorithmic ideas of existing visual tracking methods based on sparse representation and correlation filtering, and summarizes the development of the main stereo matching methods based on deep neural networks. It then proposes two robust target tracking algorithms and two lightweight stereo matching networks. The proposed tracking algorithms explore, to a certain extent, the combination of sparse representation and correlation filtering. The proposed stereo matching algorithms examine the mapping between the intensity edges of the input high-resolution color image and the disparity edges of the predicted disparity map. The main work can be divided into the following four points:

(1) To address the redundant, repeated feature extraction and coding in existing visual tracking algorithms based on forward sparse representation models, a real-time sparse tracking algorithm based on a circulant reverse sparse model is proposed. The algorithm builds on a reverse sparse representation model and uses the candidate target sample set generated by the cyclic shift operator as a sparse dictionary to sparsely encode the target template. Because the sparse coding problem is solved only once, in reverse, for the target template, and because the optimization formula containing the circulant-shifted candidate feature set is converted to the frequency domain, the optimization is very efficient. Compared with the classic sparse tracking algorithm, the proposed algorithm achieves better overall performance and runs faster.

(2) To address the fact that existing target tracking algorithms based on discriminative correlation filtering ignore the local spatial structure of the target, a sparse regularized correlation filter tracking algorithm based on a spatial tree structure is proposed. The algorithm introduces a group sparse regularization term, which expresses the hierarchical structure of the target's internal space, into the correlation filtering objective. This term applies regularization constraints to local filter groups at different levels in order to express the relationship between local appearance features and the expected response at different spatial locations of the target. Key steps of the optimization are also converted to the frequency domain, using the properties of circulant matrices, so that they can be solved quickly. Compared with correlation filter tracking algorithms based on a holistic model, the proposed sparse correlation tracking algorithm based on local spatial structure achieves better performance metrics.

(3) To address the slow disparity upsampling structure of existing end-to-end stereo matching networks, and their inability to effectively recover fine structures such as edges, an improved end-to-end stereo matching network model is proposed. The work focuses on a local adaptive awareness convolution structure and a related loss term for the disparity upsampling and refinement stage. Based on this shared convolution structure, the semantic relationship between image intensity pixels and disparity pixels at different upsampling stages is explored. During training, discontinuous edges in the predicted disparity map are adaptively perceived through gradient updates of the shared convolution weights. Experiments show that the disparity upsampling structure in the proposed network is more effective than an upsampling structure that directly concatenates intensity features and disparity features, and that the network achieves better prediction accuracy and speed.

(4) To address the practical problem of low inference efficiency in existing stereo matching network models, a deep convolutional neural network structure that effectively combines low-resolution disparity estimation with a super-resolution subnet is proposed. Following the principle of reducing the resolution at which the convolutional layers operate, the structure constructs the matching cost volume and performs cost aggregation and disparity regression at low resolution to quickly obtain an initial disparity map. It then uses the proposed super-resolution subnet to perform fast hierarchical upsampling of the initial disparity map, supplementing high-frequency information and refining disparity noise. Compared with the latest end-to-end stereo matching networks, the proposed model has higher prediction accuracy and faster prediction speed.

In summary, this thesis studies the core algorithms in two computer vision directions, visual tracking and stereo matching, and thereby provides technical support for remote environment visual perception in unmanned construction machinery. Finally, it also provides system design ideas and relevant simulation experiments for practical applications.
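The frequency-domain speed-up that points (1) and (2) rely on can be illustrated with a minimal sketch. It is not the dissertation's reverse sparse model or its tree-structured group sparse regularizer. It is only the standard ridge-regression correlation filter on a 1-D signal, assuming NumPy, where the training samples are all cyclic shifts of one base sample. The function names, the signal length, the Gaussian target response, and the value of lam are all hypothetical choices made for illustration.

```python
import numpy as np


def shift_matrix(x):
    """Stack every cyclic shift of x as a row (a circulant data matrix)."""
    n = x.shape[0]
    return np.stack([np.roll(x, i) for i in range(n)])


def ridge_direct(x, y, lam):
    """Solve (X^T X + lam I) w = X^T y explicitly; cubic cost in n."""
    X = shift_matrix(x)
    n = x.shape[0]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)


def ridge_fft(x, y, lam):
    """Same solution, element-wise in the Fourier domain; O(n log n) cost.

    With rows defined as np.roll(x, i), X^T y is a circular convolution and
    X^T X is diagonalized by the DFT with eigenvalues |fft(x)|^2.
    """
    x_hat = np.fft.fft(x)
    y_hat = np.fft.fft(y)
    w_hat = x_hat * y_hat / (np.abs(x_hat) ** 2 + lam)
    return np.real(np.fft.ifft(w_hat))


# Hypothetical toy data: a random base sample and a circular Gaussian
# target response peaked at zero shift.
n = 64
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
idx = np.arange(n)
dist = np.minimum(idx, n - idx)
y = np.exp(-0.5 * (dist / 2.0) ** 2)
lam = 1e-2  # assumed regularization weight

w_direct = ridge_direct(x, y, lam)
w_fft = ridge_fft(x, y, lam)
```

Both functions return the same filter up to floating-point error, but the Fourier version never forms the n-by-n matrix. This is the property of circulant matrices that the abstract refers to when it says the optimization is converted to the frequency domain.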
Keywords/Search Tags: correlation filtering, target tracking, stereo matching, depth perception, remote stereo vision