
Combining Knowledge And Neural Networks For Text Representation

Posted on: 2019-01-15    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Y M Li    Full Text: PDF
GTID: 1368330548477389    Subject: Computer Science and Technology
Abstract/Summary:
Text representation is a key task for many natural language processing applications such as text classification, text clustering, ranking, sentiment analysis, and so on. Its goal is to represent unstructured text documents numerically so that they can be processed mathematically. Different representations may capture and disentangle different degrees of the explanatory factors hidden in the text. The problem has therefore attracted considerable attention from researchers, and various types of models have been proposed for text representation.

Most existing works employ generic deep learning algorithms to learn text representations. However, these works do not take the unique properties of text into account. Compared with other domains, text itself is usually semantically ambiguous and conveys limited information. In addition, text data has a hierarchical structure. Semantically, the meaning of a longer expression (e.g., a document) derives from the meanings of its constituents and the rules used to combine them. Structurally, a document is composed of a list of sentences, and each sentence consists of a list of words. For these reasons, it is worthwhile to study how to introduce external knowledge and the hierarchical structure of text into neural network models to generate more informative text representations. Specifically, the main contributions of this thesis are as follows:

(1) This thesis studies the conceptualization of text with a framework that combines a probabilistic knowledge base and neural-network-based distributed text representation, and generates a concept-level distributed text representation. Using an external probabilistic knowledge base, the framework recognises the entities in the text and disambiguates vague entities through their context to obtain accurate concepts, so that the raw text is transformed into conceptualized text composed of a set of concepts. The framework then applies a distributed text representation algorithm to obtain a low-dimensional vector representation of the text.

(2) This thesis presents a knowledge-powered hierarchical neural network model. The model integrates a multi-relational knowledge graph into the neural network and uses a hierarchical structure to generate the document representation. For the external knowledge, the model uses the multi-relational knowledge graph to produce knowledge graph entity vectors that supplement the background knowledge of the original text. For the model structure, it uses two bidirectional Gated Recurrent Unit encoders to generate sentence representations, and then uses two bidirectional Long Short-Term Memory encoders to generate the document representation. This hierarchical structure of the proposed model corresponds to the hierarchical structure of the document.

(3) This thesis presents a hierarchical neural network model that combines an attention mechanism with an external knowledge graph. In addition to the multi-relational knowledge graph and the hierarchical network structure, this model adds an attention mechanism, and can be seen as an improvement and extension of the previous model. For the sentence composition part, the model supplements the background knowledge of the original text by introducing external knowledge graph entity vectors. For the document composition part, sentences are first encoded by a bidirectional Long Short-Term Memory encoder, and then a sentence-level attention mechanism is used to emphasize the sentences that help the model generate a better document representation. The final document representation is the weighted sum of the bidirectional LSTM's hidden states.
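As a concrete illustration of the document composition described in (2) and (3), the following is a minimal sketch, written in PyTorch, of a hierarchical encoder that builds sentence vectors with a bidirectional GRU, encodes the sentence sequence with a bidirectional LSTM, and returns the attention-weighted sum of the LSTM hidden states as the document vector. All module names and layer sizes are illustrative assumptions, the knowledge graph entity vectors are omitted, and the two-encoder stacks are simplified to a single BiGRU and BiLSTM; this is a sketch, not the thesis implementation.

import torch
import torch.nn as nn


class HierarchicalAttentionEncoder(nn.Module):
    # Hypothetical hierarchical document encoder with sentence-level attention.
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=50):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Word-level encoder: a bidirectional GRU produces sentence representations.
        self.word_encoder = nn.GRU(embed_dim, hidden_dim,
                                   bidirectional=True, batch_first=True)
        # Sentence-level encoder: a bidirectional LSTM over the sentence vectors.
        self.sent_encoder = nn.LSTM(2 * hidden_dim, hidden_dim,
                                    bidirectional=True, batch_first=True)
        # Sentence-level attention: one score per BiLSTM hidden state.
        self.attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, docs):
        # docs: (batch, n_sentences, n_words) of word indices
        b, n_sent, n_word = docs.shape
        words = self.embedding(docs.view(b * n_sent, n_word))
        _, h_word = self.word_encoder(words)                   # (2, b*n_sent, hidden)
        sent_vecs = torch.cat([h_word[0], h_word[1]], dim=-1)  # (b*n_sent, 2*hidden)
        sent_vecs = sent_vecs.view(b, n_sent, -1)

        h_sent, _ = self.sent_encoder(sent_vecs)               # (b, n_sent, 2*hidden)
        weights = torch.softmax(self.attn(h_sent), dim=1)      # (b, n_sent, 1)
        # Document vector = attention-weighted sum of the BiLSTM hidden states.
        return (weights * h_sent).sum(dim=1)                   # (b, 2*hidden)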
Keywords/Search Tags:Representation Learning, Domain Knowledge, Probabilistic Knowledge Base, Word Embedding, Knowledge Graph Embedding, Neural Network, Text Representation