Local Feature Extraction Based On Deep Learning And Its Application

Posted on: 2020-04-23 | Degree: Master | Type: Thesis
Country: China | Candidate: J X Li | Full Text: PDF
GTID: 2428330602455356 | Subject: Computational Mathematics
Abstract/Summary:
With continuing advances in science and technology and the growing demand for security measures across industries, video monitoring systems are used more and more widely, and research on intelligent video monitoring in particular is attracting increasing attention. Video monitoring collects dynamic video images, which can be received on the move through professional monitoring products. It comprises front-end acquisition, image transmission, terminal image extraction, storage, control, and display. Video monitoring is generally used for remote monitoring, also known as remote network monitoring: the operator is not near the camera or other acquisition equipment but views the on-site video from a distance over the network, so the scene can be watched in real time even when the operator is not present. The monitoring system is compact and relatively stable; it frees people from tedious work and, unlike a human observer, does not suffer from physiological problems such as visual fatigue. Video surveillance is used in all aspects of life and brings convenience to people. For example, traffic monitoring covers road conditions over a large area, so that traffic police can be notified and reach the scene as soon as possible after an accident, and video monitoring systems installed in supermarkets and banks help protect the legitimate rights, interests, and personal safety of consumers.

However, the traditional video surveillance described above cannot free people completely, because the behavior of moving objects in the video still has to be analyzed manually. If the system can analyze the behavior of objects automatically while monitoring, that is, if it realizes intelligent monitoring, it saves human, financial, and material resources, reduces people's workload, and ensures economic benefits. The core technology of intelligent video monitoring is human behavior recognition, which identifies and analyzes the target. This recognition and analysis can be divided into gesture recognition, behavior recognition, and event analysis, with the aim of determining what the target is doing and what it will do next. Human behavior analysis is mainly achieved by extracting behavior features and classifying them. The features to be extracted include inter-frame and intra-frame features, rectangular moment features, and motion speed features. The methods of feature classification include multi-feature human behavior recognition, background subtraction, frame differencing, and optical flow.

There are many human behavior recognition technologies. Today the mainstream machine-learning approach to human behavior recognition is deep learning, which is mainly used to solve the problem of behavior (action) recognition in video. There are two ways to solve this problem: one extracts and classifies temporal and spatial features for video recognition; the other extracts skeleton information and retrains on it for pose estimation. Representative methods include the two-stream method, the C3D method, and the CNN-LSTM method. Because neither kind of method can assign multiple labels to a behavior, softmax regression cannot be applied effectively. In multi-class tasks it is then difficult to treat the image of a person in the video as a feature of a given class, so features cannot be turned into decisions, and detection and analysis progress slowly or fail outright. These methods also cannot randomly shuffle the data set before training, so there is no guarantee that the data fed to the model differs from one training round to the next, which may cause training to crash repeatedly. Under these conditions fine-tuning of the model does not proceed smoothly: initialization of the network is blocked, which leads to a low learning rate; adjusting the learning-rate settings of individual layers has no effect on the network; the model fails to generalize; and forward computation lacks a converging training process. As a result, the accuracy on positive and negative samples, the top-1/top-5 accuracy, and the confusion-matrix results are all greatly reduced.

Based on deep learning, this thesis optimizes current research methods and models and proposes a deep-learning-based optimization algorithm to improve the accuracy of feature extraction and feature analysis. It then describes and extracts the shape, spatiotemporal, and regional features of the target. By fusing multiple local features of a single target, new and enhanced feature variables are generated. The improved feature extraction method makes substantial progress in spatiotemporal interest points and in the accuracy of feature extraction and feature analysis. In pursuit of a global or local optimum, the classification accuracy of the model is continuously improved and the deviation between the classification result and the true value is reduced.
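
The abstract mentions softmax regression, shuffling the data set between training rounds, and evaluation by top-1/top-5 accuracy and a confusion matrix. The sketch below only illustrates what those terms usually mean; it is not the method of the thesis. The function names, the four example score rows, and the three behavior classes are all hypothetical, and top-2 stands in for top-5 because the toy example has only three classes.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    top_k = np.argsort(-probs, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

def confusion_matrix(pred, labels, num_classes):
    """cm[i, j] counts samples whose true class is i and predicted class is j."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (labels, pred), 1)
    return cm

def shuffled_epochs(num_samples, num_epochs, seed=0):
    """Yield a fresh random permutation of the sample indices for every epoch."""
    rng = np.random.default_rng(seed)
    for _ in range(num_epochs):
        yield rng.permutation(num_samples)

# Hypothetical classifier scores for 4 video clips over 3 behavior classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 2.0, 0.5],
                   [0.3, 2.5, 2.0]])
labels = np.array([0, 2, 0, 2])  # hypothetical ground-truth classes

probs = softmax(logits)            # turn scores into class probabilities
pred = probs.argmax(axis=1)        # decision: most probable class per clip
top1 = top_k_accuracy(probs, labels, k=1)
top2 = top_k_accuracy(probs, labels, k=2)
cm = confusion_matrix(pred, labels, num_classes=3)

# Index orders for three training rounds; each round visits every sample once.
epoch_orders = list(shuffled_epochs(num_samples=len(labels), num_epochs=3))
```

In this toy case two of the four clips are classified correctly, while the true class of every clip is among its two highest-scoring classes, which is why top-k accuracy with k > 1 is often reported alongside top-1.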
Keywords/Search Tags: Target Detection, Feature Extraction and Fusion, Behavior Recognition, Template Matching