
Research On The Application Of Attention Mechanism In Computer Vision

Posted on: 2022-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: K Jin
GTID: 2518306539980909
Subject: Control Engineering
Abstract/Summary:
With the rapid development of artificial intelligence, AI-based algorithms are now widely used in many aspects of daily life, and computer vision algorithms have advanced particularly quickly. Classification and regression are the two most common problem types in computer vision, and most studies in the field focus on them. As deep learning has developed, more and more algorithms have been proposed to solve these problems. Among them, the attention mechanism is an active research direction that has been applied to a variety of computer vision tasks. This thesis studies the application of the attention mechanism to classification and regression problems in computer vision. Facial expression recognition is taken as the example classification problem, and crowd counting as the example regression problem. The main contributions are as follows:

1. For facial expression recognition, this thesis designs a convolutional neural network model based on the attention mechanism and LBP features. LBP features reflect subtle changes in facial texture and wrinkles, which helps the network capture subtle changes of expression. LBP features and convolutional features are fused to strengthen the extraction of key features. An attention mechanism is then added so that the network focuses on the key features of different expressions in the image space, which suppresses useless features and irrelevant factors such as the background and thereby increases recognition accuracy. To verify the effectiveness of the network, we collected and annotated a facial expression dataset, NCUFE. The proposed method was tested on NCUFE and on four representative expression datasets: JAFFE, CK+, FER2013 and Oulu-CASIA. Experimental results demonstrate the feasibility and effectiveness of the proposed method.

2. For crowd counting, this thesis designs a dual attention mechanism-based scale aggregation network (DASANet) to address background clutter and scale variation in crowd images. To focus attention on the head regions of the crowd and prevent interference from complex backgrounds, the network embeds two attention mechanisms: a channel attention mechanism and a spatial attention mechanism. The channel attention mechanism focuses the network on the channels whose features are more meaningful, while the spatial attention mechanism assigns different degrees of attention to different locations in the feature space. To handle scale variation, we designed an upsampling aggregation module containing two scale aggregation modules that adapt to the features of human heads at different scales. Experimental results on six representative crowd counting datasets, namely ShanghaiTech (Part A, Part B), UCF_CC_50, WorldExpo'10, UCSD and UCF-QNRF, indicate that DASANet outperforms other state-of-the-art crowd counting methods.
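The abstract does not give the layer-level design of the two attention mechanisms, so the following is only a minimal NumPy sketch of the general idea, not the DASANet implementation. It assumes a squeeze-and-excitation style channel attention and a simplified spatial attention that mixes channel-wise average and maximum maps. The weight matrices `w1` and `w2`, the `mix` coefficients and the feature map sizes are hypothetical placeholders for parameters that would be learned in a real network.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    w1 (C_hidden, C) and w2 (C, C_hidden) stand in for learned weights;
    they are assumptions for illustration only.
    """
    squeezed = features.mean(axis=(1, 2))      # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # bottleneck layer with ReLU
    weights = sigmoid(w2 @ hidden)             # one weight per channel
    return features * weights[:, None, None], weights


def spatial_attention(features, mix=(0.5, 0.5)):
    """Reweight the spatial locations of a (C, H, W) feature map.

    A learned convolution over the two pooled maps is replaced here by a
    fixed weighted sum (mix), which is a simplifying assumption.
    """
    avg_map = features.mean(axis=0)            # (H, W) average over channels
    max_map = features.max(axis=0)             # (H, W) maximum over channels
    mask = sigmoid(mix[0] * avg_map + mix[1] * max_map)
    return features * mask[None, :, :], mask


# Random stand-ins for a feature map and for learned weights.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1

channel_refined, channel_weights = channel_attention(features, w1, w2)
refined, spatial_mask = spatial_attention(channel_refined)
print(refined.shape, channel_weights.shape, spatial_mask.shape)
```

Applying channel attention first and spatial attention second is one common ordering; the abstract does not state which ordering, or whether a parallel arrangement, is used in DASANet.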
Keywords/Search Tags:computer vision, deep learning, attention mechanism, facial expression recognition, crowd counting