Research On Semantic Enhancing Relational Similarity Measurement

Posted on: 2012-10-01
Degree: Master
Type: Thesis
Country: China
Candidate: S M Cai
GTID: 2178330332467453
Subject: Computer application technology
Abstract/Summary:
In practice, both the semantic relatedness between words and the relational similarity between pairs of words need to be quantified. Both measures are widely used in natural language processing, in tasks such as information retrieval, information extraction, word sense disambiguation, and machine translation. In recent years, research on the semantic relatedness between words has produced fruitful results, and algorithms for the relational similarity between pairs of words have become an active research topic.

Existing relational similarity algorithms fall mainly into two types: those based on semantic resources and those based on statistics. The former compute similarity from a manually built semantic dictionary or a semantic web. The latter are entirely data-driven: they collect, from a large corpus, information about how the two words of a pair occur together in context. This thesis studies measures of relational similarity between pairs of words and, to address the problems of the existing algorithms, proposes two new ones. One is based on latent semantic retrieval (LSR), and the other is based on the combination of semantics with statistics (SSR). The main achievements are as follows:

Firstly, we studied the existing relational similarity algorithms and, after analyzing their advantages and disadvantages, proposed a new statistics-based relational similarity measure called LSR. We validated LSR's performance on classical test data. The results show that LSR has clear advantages over the original algorithms, especially in running time.

Secondly, to address the disadvantages of LSR while drawing on the advantages of algorithms for semantic relatedness between words, we proposed a new relational similarity algorithm based on the combination of semantics with statistics, called SSR. Experimental results show that SSR outperforms LSR, especially in accuracy.

This thesis is a useful exploration of issues related to natural language processing.
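
As an illustration of the statistics-based family described above, the following minimal sketch represents each word pair by the word sequences that occur between its two words in a corpus, then compares two pairs by the cosine of those pattern-count vectors. It is not the thesis's LSR or SSR algorithm, whose details the abstract does not give; the function names, the `max_gap` parameter, and the toy corpus are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Toy corpus (hypothetical); a real system would use a large corpus.
CORPUS = [
    "The mason cuts the stone.",
    "A carpenter cuts the wood.",
    "The mason works with stone.",
    "The carpenter works with wood.",
    "The ostrich is a large bird.",
]

def pattern_vector(corpus, a, b, max_gap=3):
    """Count the word sequences found between `a` and a following `b`."""
    counts = Counter()
    for sentence in corpus:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for i, tok in enumerate(tokens):
            if tok != a:
                continue
            # Look ahead for `b` with at most `max_gap` intervening words.
            for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
                if tokens[j] == b:
                    counts[" ".join(tokens[i + 1:j])] += 1
                    break
    return counts

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def relational_similarity(corpus, pair1, pair2):
    """Compare two word pairs by the contexts that connect their words."""
    return cosine(pattern_vector(corpus, *pair1), pattern_vector(corpus, *pair2))

if __name__ == "__main__":
    print(relational_similarity(CORPUS, ("mason", "stone"), ("carpenter", "wood")))
    print(relational_similarity(CORPUS, ("mason", "stone"), ("ostrich", "bird")))
```

On this toy corpus, mason:stone and carpenter:wood share the same connecting patterns and so score higher than mason:stone against ostrich:bird, which share none.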
Keywords/Search Tags: Semantic Relatedness, Relational Similarity, Wikipedia, WordNet, LSR