
Double Attention Mechanism For Sentence Embedding

Posted on: 2019-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: KAKANAKOU MIGUEL STEPHANE
Full Text: PDF
GTID: 2348330569480153
Subject: Computer Science and Technology
Abstract/Summary:
Sentence embedding is a learned representation of a sentence in which sentences with the same meaning receive similar representations. It is a class of natural language processing (NLP) techniques in which individual sentences are represented as real-valued vectors in a predefined vector space. By mapping similar sentences to similar vectors, it models the semantic dimension of a sentence. The goal is to train a model that automatically transforms a sentence into a vector encoding its semantic meaning. Sentence embedding plays an important role in many high-level NLP tasks, such as text classification, question answering, and machine translation. The motivation of this thesis is therefore to obtain a better, more meaningful representation of sentences so that high-level NLP tasks can achieve higher accuracy.

Recent sentence embedding models do not make efficient use of the attention mechanism, which could help improve their accuracy. This thesis proposes a new sentence embedding model that uses the attention mechanism efficiently to improve accuracy. The proposed model uses a double attention mechanism to combine a recurrent neural network (RNN) and a convolutional neural network (CNN). First, the model uses a bidirectional Long Short-Term Memory recurrent neural network (BiLSTM) with a self-attention mechanism to compute a first representation of the sentence, called the primitive representation. The primitive representation is then used, together with a convolutional neural network equipped with a pooling-based attention mechanism, to compute a set of attention weights applied during the pooling step. The final sentence representation is obtained by concatenating the output of the CNN with the primitive sentence representation. The double attention mechanism helps the proposed model retain more of the information contained in the sentence and thus generate a more representative feature vector. The model can be trained end-to-end with few hyper-parameters. We evaluate the model on three benchmark datasets for the sentence classification task and compare it with state-of-the-art methods. Experimental results show that the proposed model yields a significant performance gain over other sentence embedding methods on all three datasets.
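The three-step pipeline described in the abstract (self-attention over BiLSTM states, attention-based pooling over CNN feature maps guided by the primitive representation, then concatenation) can be sketched in a minimal, hypothetical form. The abstract does not specify dimensions, weight matrices, or the exact attention formulas, so everything below is an illustrative assumption: random matrices stand in for trained BiLSTM and CNN layers, and the scoring functions are common attention formulations, not necessarily the thesis's own.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; not specified in the abstract).
seq_len, hidden = 6, 8       # one BiLSTM hidden vector per token
n_filters = 8                # CNN feature maps, one row per position

H = rng.standard_normal((seq_len, hidden))   # stand-in for BiLSTM outputs

# Step 1: self-attention over BiLSTM states -> primitive representation.
W1 = rng.standard_normal((hidden, hidden))
w2 = rng.standard_normal(hidden)
scores = np.tanh(H @ W1) @ w2                # one relevance score per token
alpha = softmax(scores)                      # self-attention weights, sum to 1
primitive = alpha @ H                        # weighted sum, shape (hidden,)

# Step 2: pooling-based attention over CNN features, guided by `primitive`.
C = rng.standard_normal((seq_len, n_filters))  # stand-in for conv outputs
Wp = rng.standard_normal((n_filters, hidden))
pool_scores = C @ Wp @ primitive             # relevance of each position
beta = softmax(pool_scores)                  # pooling attention weights
pooled = beta @ C                            # attention pooling, (n_filters,)

# Step 3: concatenate the CNN output with the primitive representation.
sentence_vec = np.concatenate([pooled, primitive])
print(sentence_vec.shape)                    # (16,)
```

The attention pooling in step 2 replaces the usual max-pooling: instead of keeping each filter's single strongest activation, every position contributes in proportion to its learned relevance to the primitive sentence representation.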
Keywords/Search Tags: Attention