
Automatic Recognition Of Chinese Synonyms For Information Retrieval

Posted on: 2006-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y Lu
Full Text: PDF
GTID: 2168360152993968
Subject: Information Science
Abstract/Summary:
Automatic recognition of synonyms plays an important role in information retrieval. It is also widely used in many other applications, such as automatic indexing, automatic classification, and machine translation.

Outside China, there has been no dedicated research on synonym recognition; related work focuses mainly on the semantic similarity of similar or related terms. In recent years, domestic research on the recognition of Chinese synonyms has followed two approaches: a word-form-similarity approach and a thesaurus-based approach.

To improve synonym recognition, this thesis presents two methods. The first applies the PageRank algorithm to dictionary definitions. We analyze the links between a given word and other words and construct an associated word graph, in which each associated word is a vertex and there is an edge from u to v if u appears in the definition of v. We then use the PageRank algorithm to calculate the degree of similarity and discover synonyms in the associated word graph. The second method is a pattern matching algorithm based on the patterns found in dictionary definitions. We write a set of extraction rules by hand, and the system then extracts synonyms automatically by pattern matching. We also use the pattern matching algorithm to extract synonyms from the web and from the text of periodical articles in economics.

We built the synonym recognition system on the Visual Basic .NET platform. To evaluate its performance, we examined the results it produced. Synonym recognition experiments on financial dictionaries show that the precision of the PageRank algorithm and the pattern matching algorithm reaches 95% and 85.6%, respectively. The test results indicate that the system is feasible and practical.

The experimental system needs further improvement in areas such as the method of gathering patterns, the coverage of the derived dictionary, and the size and scope of the test set.
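The graph construction used by the first method can be illustrated with a minimal Python sketch. It is not the thesis's implementation (which was written in Visual Basic .NET and is not reproduced in the abstract). The toy English `definitions` table, the function names, and the use of a personalized PageRank that restarts at the query word are all assumptions made for illustration; the abstract does not say how the PageRank scores are turned into a similarity degree.

```python
from collections import defaultdict

def build_graph(definitions):
    """Add an edge u -> v whenever word u appears in the definition of headword v."""
    graph = defaultdict(set)
    for headword, definition_words in definitions.items():
        graph[headword]  # make sure every headword is a vertex
        for u in definition_words:
            if u != headword:
                graph[u].add(headword)
    return graph

def pagerank(graph, seed=None, damping=0.85, iterations=50):
    """Power-iteration PageRank; with `seed`, restarts go only to that word (assumption)."""
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    if seed is not None and seed not in nodes:
        raise ValueError(f"unknown seed word: {seed}")
    n = len(nodes)
    if seed is None:
        restart = {w: 1.0 / n for w in nodes}
    else:
        restart = {w: 1.0 if w == seed else 0.0 for w in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {w: (1 - damping) * restart[w] for w in nodes}
        for u in nodes:
            targets = graph.get(u, ())
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling vertex: hand its mass back to the restart distribution.
                for w in nodes:
                    new_rank[w] += damping * rank[u] * restart[w]
        rank = new_rank
    return rank

# Hypothetical toy dictionary: headword -> words appearing in its definition.
definitions = {
    "loan": ["money", "lend", "interest"],
    "credit": ["money", "lend", "interest"],
    "lend": ["money", "loan"],
    "interest": ["money", "loan", "credit"],
    "money": ["currency"],
}

graph = build_graph(definitions)
scores = pagerank(graph, seed="loan")

# List the other words by score as rough synonym candidates for "loan".
candidates = sorted((w for w in scores if w != "loan"), key=scores.get, reverse=True)
for word in candidates:
    print(f"{word}\t{scores[word]:.4f}")
```

Restarting at the query word keeps the scores local to its neighbourhood in the graph, so words it cannot reach through definition links receive a score of zero. A real system would need Chinese word segmentation of the definitions and a threshold or cut-off on the ranked list, neither of which is shown here.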
Keywords/Search Tags: Chinese synonyms, Automatic recognition, Pattern matching, Word definition, PageRank algorithm