Font Size: a A A

Probabilistic models for text understanding

Posted on:2015-05-24Degree:Ph.DType:Dissertation
University:The University of Texas at DallasCandidate:Erekhinskaya, Tatiana NikolaevnaFull Text:PDF
GTID:1478390017993420Subject:Computer Science
Abstract/Summary:
Text understanding is a key task of natural language processing and artificial intelligence. The challenge is to understand not only what is explicitly said in text, but also what is implied. A system can achieve this by connecting relevant world knowledge with the newly read information and making further inference.;The main challenges of this task are large scale and high degree of uncertainty, which suggests usage of probabilistic models. The general architecture of such system has three components: a language to represent information from the text and semantic parser to extract the representation, a large knowledge base containing world knowledge, and a probabilistic model able to link relevant knowledge to the input text.;This work presents a novel probabilistic approach to semantic parsing of sentences. The representation is based on a fixed set of dyadic relations. The approach uses Markov logic to express properties of relations and structural constraints. A new sampling algorithm is presented, which significantly outperforms existing algorithms like MaxWalkSAT both in speed and quality of results. Joint inference in Markov logic allows to significantly increase recall while keeping the same or better level of precision compared to isolated classification. The experiments showed that the quality is comparable to the state-of-the-art semantic role labeling system when compared using a common subset of relations.;The semantic parser is used to build a knowledge base from WordNet glosses. The resulting structures are converted into probabilistic implications. The probabilistic approach allows to perform abductive reasoning inverting the direction of implications. To scale the inference, first-order logic inference is used first to construct proof trees, and then probabilistic inference is done over the trees.;Finally, probabilistic models are applied to semantic textual similarity, which measures the degree of semantic equivalence between sentences. This work presents a way to create tractable Markov logic models for the semantic textual similarity task, yielding the highest correlation among other Markov logic-based approaches. The generalized linear models-based approach simplifies the model and yields state-of-the-art results on a corpora containing simple sentences. The best results are achieved using probabilistic inference on the knowledge base.
Keywords/Search Tags:Probabilistic, Text, Knowledge base, Inference
Related items