
Deep Neural Networks For Text Representation And Application

Posted on: 2017-04-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B T Hu
Full Text: PDF
GTID: 1108330503469920
Subject: Computer application technology
Abstract/Summary:
In recent years, deep neural networks (DNNs) have been explored on a variety of tasks such as image classification and speech recognition, where they achieve state-of-the-art performance and demonstrate a powerful capacity for representation learning. Language representation is a core issue of natural language processing. The oversimplified bag-of-words hypothesis has been a bottleneck for many tasks because of the curse of dimensionality and data sparsity. In recent years, DNNs have come to dominate research on language representation learning. However, the flexibility and rich semantic content of language make representation learning via DNNs difficult. This thesis studies language representation learning and its applications via DNNs.

Firstly, word representation learning is studied. The Continuous Dissociation between Nouns and Verbs model (CDNV) is proposed. CDNV integrates part-of-speech (POS) tag information into the word embedding learning process while preserving word order. Inspired by the principle of the dissociation between nouns and verbs, the model dynamically chooses the connections on the output layer according to the POS tag information. Comparisons with most public word embeddings show that CDNV learns high-quality word embeddings with relatively low time complexity. The nearest neighbors of representative words derived from the CDNV embeddings are more reasonable, and the F1 improvements from CDNV embeddings on NER and chunking are significantly greater than those from other word embeddings.

Secondly, sentence modeling is studied. A deep convolutional neural network sentence model is proposed. This model does not rely on a parse tree and can represent the hierarchical structure of sentences through layer-by-layer convolution and pooling. Semantic matching is of central importance to many natural language tasks: a successful matching algorithm needs to adequately model the internal structures of language objects and the interactions between them. As a step toward this goal, two deep convolutional neural network based sentence matching architectures are proposed. Architecture I obtains two sentence representations via two convolutional neural networks and then matches them with a multi-layer perceptron, whereas Architecture II models the matching of the two sentences directly and scores the matching representation with a multi-layer perceptron. Neither architecture requires prior knowledge, so both can be applied to matching tasks of different natures and languages. An empirical study on a variety of matching tasks demonstrates the efficacy of the proposed architectures and their superiority to competing models. Architecture II is better than Architecture I at capturing the hierarchical matching patterns between two sentences, and it achieves state-of-the-art performance on three tasks.

Thirdly, bilingual phrase selection is studied. A context-dependent convolutional neural network bilingual phrase matching model is proposed. It encodes not only the semantic similarity of a translation pair but also the context containing the phrase in the source language. To train the model efficiently, the word embeddings are initialized with pre-trained context-dependent bilingual word embeddings, and a curriculum learning algorithm is proposed that gradually trains the model from easy examples to difficult ones (a minimal sketch of this easy-to-hard schedule follows).
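The sketch below illustrates the general easy-to-hard curriculum schedule described above; it is not the thesis's actual training procedure. The difficulty measure, the staging scheme, and the names difficulty, curriculum_train, and train_step are all illustrative assumptions (phrase length stands in for whatever hardness criterion the thesis uses).

```python
# A minimal sketch of an easy-to-hard curriculum schedule, assuming a
# generic train_step() and a per-example difficulty score. The thesis's
# actual difficulty measure for bilingual phrase pairs is not given here,
# so source-phrase length is used as a hypothetical stand-in.
import random

def difficulty(pair):
    # Hypothetical proxy: longer source phrases are treated as harder.
    source, target = pair
    return len(source.split())

def curriculum_train(pairs, train_step, num_stages=4, epochs_per_stage=2):
    """Train on progressively larger, harder subsets of the data."""
    ordered = sorted(pairs, key=difficulty)           # easy -> hard
    for stage in range(1, num_stages + 1):
        # Each stage unlocks a larger prefix of the difficulty-sorted data.
        cutoff = len(ordered) * stage // num_stages
        subset = ordered[:cutoff]
        for _ in range(epochs_per_stage):
            random.shuffle(subset)                    # shuffle within a stage
            for pair in subset:
                train_step(pair)

# Toy usage with a no-op training step:
if __name__ == "__main__":
    data = [("le chat", "the cat"),
            ("un tres long exemple de phrase", "a very long example phrase")]
    curriculum_train(data, train_step=lambda pair: None)
```

The point of the staging is that the model first fits the easy prefix of the sorted data and only later sees the full, harder distribution.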
Experimental results show that the approach significantly outperforms the baseline system by 1.0 BLEU point.

Fourthly, automatic text summarization is studied. This thesis constructs a large-scale Chinese short text summarization dataset consisting of over 2.4 million examples, together with a high-quality test set. A recurrent neural network (RNN) encoder-decoder architecture for summary generation is proposed, and two models are constructed. Model I uses one RNN to encode the short text (the RNN encoder), whose last hidden state represents the text, and another RNN to generate the summary from that representation (the RNN decoder). Building on Model I, Model II dynamically constructs the context from all hidden states of the RNN encoder (the two context mechanisms are sketched after the concluding paragraph). Neither model requires prior knowledge. Experimental results show that both models can generate informative summaries, and the summaries generated by Model II are significantly better than those of Model I.

To sum up, this thesis uses DNNs to study the representation of text at different granularities, i.e., word, sentence, and paragraph. The proposed models and methods are applied to sequence labeling, sentence matching, machine translation, and automatic text summarization, and several of them achieve state-of-the-art performance.
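The sketch below contrasts the two decoding contexts described for Model I and Model II. It is a simplified numpy illustration under stated assumptions, not the thesis's implementation: encoder states are taken as a matrix of shape (T, d), the relevance score is a plain dot product, and the names fixed_context and dynamic_context are hypothetical.

```python
# A minimal numpy sketch of the two summarization context mechanisms,
# assuming encoder hidden states of shape (T, d); names are illustrative.
import numpy as np

def fixed_context(encoder_states):
    """Model I: the last encoder hidden state summarizes the whole text."""
    return encoder_states[-1]

def dynamic_context(encoder_states, decoder_state):
    """Model II: a decoder-state-dependent weighted sum over all encoder
    hidden states (an attention-style dynamic context)."""
    scores = encoder_states @ decoder_state        # (T,) relevance scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # softmax over time steps
    return weights @ encoder_states                # (d,) context vector

# Toy usage: 5 time steps, hidden size 8.
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))                    # encoder hidden states
s = rng.standard_normal(8)                         # current decoder state
print(fixed_context(H).shape, dynamic_context(H, s).shape)  # (8,) (8,)
```

The design difference matters because Model I compresses the whole input into one fixed vector, while Model II recomputes the context at every decoding step, letting the decoder attend to different parts of the source text.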
Keywords/Search Tags: Deep Learning, Language Representation, Word Vector, Semantic Matching, Automatic Text Summarization