Font Size: a A A

Fine-grained Visual Categorization And Application Based On Deep Learning

Posted on:2019-05-23Degree:MasterType:Thesis
Country:ChinaCandidate:W WangFull Text:PDF
GTID:2518305906472384Subject:Control Engineering
Abstract/Summary:PDF Full Text Request
In recent years,the great success of deep learning technologies have promoted the rapid development of computer vision.In computer vision,issues related to image recognition and image classification have always been one of the most widely studied fields;Our research topic in this paper is facial action unit recognition based on multi-methods.The article aims to improve the effect of facial action unit recognition and the focus is on the methodology of facial action unit recognition.In the introduction,the article first describes the background,significance and current status of facial action unit recognition,followed by chapter arrangements.The second part mainly elaborates the basic theory of facial action unit recognition.First of all,it starts from the general practice of image recognition,including data preprocessing,feature extraction,classifier design,etc.Taking into account that facial action unit recognition is related to face problems,it also involves the alignment of face and preprocessing of pixels in the basic theory.The third part directly cuts into the subject,for the recognition of facial action units,the method of recognizing facial action units with traditional features is basically reproduced.Specifically,the HOG+SVM method is used to identify facial action units,the HOG feature and SVM classifier are introduced in two parts,then the experimental design and result analysis are followed.In the fourth part,because of the recognition accuracy based on the previous traditional feature method is not enough,the deep learning approach that take into account the time dependence of image data and spatial position between pixels is proposed,CNN+LSTM is our choice.Specifically,an 8-layer network structure and LSTM unit are designed,then for this rough prototype,explanation of each layer's parameter settings are given;The final result shows that there is indeed a lot of optimization.In the fifth part,considering that the network is self-designed when identify facial action units using CNN+LSTM,it will bring about slow network convergence,and whether the time dependence of learning neurons directly with LSTM really reflects the relationship between frame and frame.So in this section,we first represent the dependency relationship between frame and frame as optical image,and then a two-way CNN network is proposed to identify facial action units,the results are voted and synthesized.In the experimental stage,the data labels are recoded to meet our requirements,fine-tuning the trained Google Net,and the time efficiency improves a lot,and the optical flow is extracted using different intervals,also the results obtained is also improved.In the last part,we summarized and look forward to the problem of facial action unit recognition.For the problem,we give the assumption that if we can get a label for a short video of several consecutive frames,video behavior analysis,segmented network TSN,video key frame and other methods can be introduced.
Keywords/Search Tags:Convolutional neural network, Facial action unit recognition, LSTM, Two-stream Network, SVM
PDF Full Text Request
Related items