
Research On Text Representation Based On Siamese Neural Network And Hybrid Neural Network

Posted on: 2020-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Liu
Full Text: PDF
GTID: 2428330578483128
Subject: Computer software and theory

Abstract/Summary:
Text representation, which aims to represent unstructured text numerically, is a fundamental task in Natural Language Processing. In recent years, with the development of deep learning, neural network models have shown a strong ability to represent data, and many researchers have begun to use them to learn text representations. This thesis studies neural-network-based text representation from two aspects: unsupervised learning and supervised learning.

For unsupervised learning, the neural topic model based on the Variational Autoencoder (VAE) has become a popular approach to text representation. However, the neural topic model does not take the similarity between different texts into account, which may lead to large differences between the representations of texts with similar semantics. For supervised learning, the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) are two typical paradigms for text representation with complementary advantages. Some studies attempt to combine CNN and RNN into a hybrid model in the hope of keeping the strengths of both. However, previous methods treat all words equally and may fail to capture the importance of individual words. To overcome these problems, the main work of this thesis is as follows:

1. We propose a novel model named the Siamese neural topic model. We extend the neural topic model with a Siamese network, which introduces the similarity between texts as a constraint during training. The sub-structure of the Siamese network adopts the Information Maximizing Variational Autoencoder (InfoVAE) as the neural topic model, which improves the correlation between the latent variables and the texts.

2. We propose a novel hybrid model that combines CNN and RNN. In our model, a CNN learns a weight matrix in which each row reflects the importance of each word from a different aspect. We also design a new word-context representation method: a neural tensor layer fuses the hidden states of each word in the bidirectional recurrent neural network, so that the semantics of the word in context can be captured more accurately.

3. Experimental results on different datasets confirm the effectiveness of the proposed models.
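The similarity constraint in contribution 1 can be illustrated with a contrastive-style penalty on the latent topic vectors produced by the shared (Siamese) encoder. This is a minimal sketch under common conventions for Siamese training; the function name, the margin value, and the exact loss form are illustrative assumptions, not the thesis's formulation.

```python
import numpy as np

def siamese_similarity_loss(z_a, z_b, same_topic, margin=1.0):
    """Contrastive-style penalty on two latent topic vectors.

    z_a, z_b: latent representations of a text pair from the
    shared (Siamese) encoder.
    same_topic: 1 if the pair is semantically similar, else 0.
    Illustrative sketch only, not the thesis's exact loss.
    """
    d = np.linalg.norm(z_a - z_b)
    if same_topic:
        return d ** 2                     # pull similar texts together
    return max(0.0, margin - d) ** 2      # push dissimilar texts apart

# A semantically similar pair that is already close in latent
# space incurs only a small penalty.
z1 = np.array([0.2, 0.8, 0.1])
z2 = np.array([0.25, 0.75, 0.1])
print(siamese_similarity_loss(z1, z2, same_topic=1))  # -> 0.005
```

Averaged over text pairs, such a term can be added to the topic model's training objective so that texts with similar semantics are driven toward similar representations.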
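The CNN-derived word-importance matrix of contribution 2 can be sketched as width-1 convolution filters that score each word, with a per-filter softmax so that each row weights the words from one "aspect". The shapes and the single-width filters are simplifying assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def word_importance_weights(embeddings, filters):
    """Score each word with several width-1 convolution filters,
    then normalise per filter. Each row of the result is an
    importance distribution over the words from one aspect.
    Illustrative sketch only.

    embeddings: (seq_len, dim) word vectors
    filters:    (num_aspects, dim) width-1 convolution filters
    """
    scores = filters @ embeddings.T       # (num_aspects, seq_len)
    return softmax(scores, axis=1)        # each row sums to 1

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 4))             # 5 words, embedding dim 4
filt = rng.normal(size=(3, 4))            # 3 aspects
W = word_importance_weights(emb, filt)
print(W.shape)                            # -> (3, 5)
```

Each row of `W` can then weight the RNN hidden states, so that words the CNN deems important contribute more to the final text representation.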
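The neural tensor layer that fuses the forward and backward hidden states can be sketched with the standard bilinear-plus-linear form of a neural tensor network; the parameter shapes and activation are assumptions chosen for illustration, not the thesis's exact layer.

```python
import numpy as np

def neural_tensor_fusion(h_fwd, h_bwd, T, V, b):
    """Fuse the forward and backward hidden states of a word with a
    neural tensor layer. Illustrative sketch only.

    h_fwd, h_bwd: (d,) hidden states from the bidirectional RNN
    T: (k, d, d) bilinear tensor, one slice per output dimension
    V: (k, 2*d) linear weights on the concatenated states
    b: (k,) bias
    """
    bilinear = np.array([h_fwd @ T[i] @ h_bwd for i in range(T.shape[0])])
    linear = V @ np.concatenate([h_fwd, h_bwd])
    return np.tanh(bilinear + linear + b)  # (k,) fused word-context vector

rng = np.random.default_rng(1)
d, k = 4, 3
h_f, h_b = rng.normal(size=d), rng.normal(size=d)
T = rng.normal(size=(k, d, d))
V = rng.normal(size=(k, 2 * d))
b = rng.normal(size=k)
out = neural_tensor_fusion(h_f, h_b, T, V, b)
print(out.shape)                          # -> (3,)
```

The bilinear term lets the two directions interact multiplicatively, which a plain concatenation of hidden states cannot express.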
Keywords/Search Tags: Text Representation, Variational Autoencoder, Topic Model, Convolutional Neural Network, Recurrent Neural Network