
Related Words Of Concept Study Based On Chinese Wikipedia

Posted on: 2013-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: K F Zhou
Full Text: PDF
GTID: 2248330371492595
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and the explosive growth of information, people find it increasingly difficult to collect and search for information. Finding accurate and comprehensive information within a limited time is therefore a challenge for search technology, and using semantic knowledge in search engine systems has become an important way to improve query performance.

In many complex situations, the meaning of an individual word is ambiguous, and some traditional methods cannot solve the disambiguation problem. Traditional semantic similarity calculation follows two approaches. The first applies statistical methods to a large-scale corpus; in practice this approach is not universally applicable, because sufficiently large and accurate corpora are lacking. The second relies on manually constructed knowledge systems, which suffer from problems such as small scale and high maintenance costs.

In response to these new requirements for semantic knowledge, this thesis focuses on mining the related words of concepts and on a new correlation calculation method based on Chinese Wikipedia. The specific contents are as follows:

First, we analyze the structure of Chinese Wikipedia and build a word correlation data set with an improved WLVM method. By analyzing the data set, we obtain related words of concepts, which can be used to express a concept and can also be applied in natural language processing tasks such as text expansion and the construction of knowledge systems.

Second, we propose a new method for computing semantic correlation between words. Building on an analysis of previous word correlation methods based on large-scale corpora or manually built encyclopedic knowledge systems, this thesis presents a semantic correlation computation between words that makes comprehensive use of the internal links, category taxonomy, article text, and anchor text of Chinese Wikipedia. In experiments, our method gives better results than the methods it was compared against.
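
As a rough illustration of link-based word correlation, the sketch below assumes that WLVM refers to the Wikipedia Link Vector Model: each article is represented by a vector over its outgoing link targets, each target is weighted by its link count times the log of inverse link frequency, and two articles are compared by cosine similarity. This is the plain model, not the improved variant used in the thesis, whose details the abstract does not give. The toy link data, the TOTAL_ARTICLES value, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: article title -> list of outgoing link targets.
LINKS = {
    "Apple": ["Fruit", "Tree", "Fruit"],
    "Pear": ["Fruit", "Tree"],
    "Car": ["Engine", "Wheel"],
}
TOTAL_ARTICLES = 10  # assumed size of the whole (toy) encyclopedia


def inbound_counts(links):
    """Count, for each target, how many distinct articles link to it."""
    counts = Counter()
    for targets in links.values():
        counts.update(set(targets))
    return counts


def link_vector(article, links, inbound, total_articles):
    """Weight each outgoing target by link count * log(inverse link frequency)."""
    occurrences = Counter(links.get(article, []))
    return {
        target: n * math.log(total_articles / inbound[target])
        for target, n in occurrences.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(a, b, links=LINKS, total_articles=TOTAL_ARTICLES):
    """Link-vector relatedness of two articles, between 0 and 1."""
    inbound = inbound_counts(links)
    return cosine(
        link_vector(a, links, inbound, total_articles),
        link_vector(b, links, inbound, total_articles),
    )


if __name__ == "__main__":
    print(relatedness("Apple", "Pear"))  # shares both link targets
    print(relatedness("Apple", "Car"))   # shares no link targets
```

On this toy data, "Apple" and "Pear" score high because they link to the same targets, while "Apple" and "Car" score zero. A method like the one the abstract describes would combine such a link signal with category, article text, and anchor text evidence; how those are combined is not stated here.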
Keywords/Search Tags: Chinese Wikipedia, word correlation, semantic knowledge, related words of concept