Term Similarity Calculation Method Based On Bayesian Network

Posted on: 2014-10-10
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhou
Full Text: PDF
GTID: 2268330401473525
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Term similarity computation is an active research area in natural language processing, and it helps users retrieve information more effectively. As an extension of probabilistic inference models, a Bayesian network can describe the semantic concepts and conditional probabilities between terms. This provides a basis for computing term similarity and for answering user queries with fast execution and rigorous reasoning. As a method for handling probabilistic problems in artificial intelligence, the Bayesian network has been applied to semantic term similarity over the past ten years.

There are currently two main approaches to term similarity calculation. The first is based on domain collections and depends heavily on the quality of the corpus. The second is based on open network resources and depends heavily on prior knowledge. Manually collected corpora are affected by interference from various factors or by missing information, which often lowers corpus quality and reduces the accuracy of term similarity calculation. To address these two problems, this thesis proposes a method that combines term similarity with the probabilistic logic reasoning of Bayesian networks, and uses that reasoning to complete the term similarity calculation. The main research work is as follows:

(1) Using MyEclipse 9.0 as the platform, term similarity is calculated based on HowNet. The structure of the term similarity is used to describe the degree of relationship between terms in the corpus. Related terms are then selected from the corpus of the Chinese Linguistics Research Center of Peking University, their similarity is calculated, and the results are compared with those obtained from HowNet. The comparison shows a certain gap between similarity computed from domain collections and similarity computed from open resources.

(2) The term similarity results are quantified as probabilities using conditional event algebra. Through parameter learning of the Bayesian network, an initial Bayesian network topology is obtained. Using the logical reasoning capability of conditional event algebra, the theoretical similarity calculation for related terms is completed.

(3) Using MATLAB as the platform, the structure diagram of the Bayesian network is built and the conditional probability relationships between nodes are calculated. The similarity between terms is computed with the conditional probability formula. The K2 algorithm is used to optimize the network structure and to complete the logical reasoning over terms. Comparison and analysis of the theoretical and experimental results illustrate the rationality and effectiveness of the method proposed in this thesis.
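To make the idea in step (3) more concrete, the following is a minimal sketch of reading a conditional probability between two term nodes as a similarity value. It is not the thesis's MATLAB implementation: the function name `conditional_probability`, the toy `documents` data, and the choice to estimate probabilities from simple co-occurrence counts are all assumptions made for illustration.

```python
def conditional_probability(observations, target, given):
    """Estimate P(target present | given present) from co-occurrence counts.

    `observations` is an iterable of sets, each holding the terms seen
    together in one unit of text (an assumed, simplified data model).
    """
    given_count = sum(1 for obs in observations if given in obs)
    if given_count == 0:
        # No evidence for the conditioning term; return 0.0 by convention.
        return 0.0
    joint_count = sum(
        1 for obs in observations if given in obs and target in obs
    )
    return joint_count / given_count


# Hypothetical toy data: each set lists the terms occurring in one document.
documents = [
    {"network", "graph"},
    {"network", "graph", "node"},
    {"network", "node"},
    {"graph"},
]

# Conditional probability of "graph" given "network", used here as a
# directed similarity score between the two terms.
similarity = conditional_probability(documents, target="graph", given="network")
print(f"P(graph | network) = {similarity:.3f}")
```

Note that a score defined this way is directional: P(A | B) and P(B | A) generally differ, so a symmetric similarity would need an additional combination step that this sketch does not attempt.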
Keywords/Search Tags: term similarity, conditional event algebra, Bayesian network, MATLAB