
Named Entity Disambiguation Based On Wikipedia

Posted on: 2012-03-09  Degree: Master  Type: Thesis
Country: China  Candidate: B R Tang  Full Text: PDF
GTID: 2218330362453597  Subject: Computer Science and Technology
Abstract/Summary:
A word sense is the specific meaning a word takes on in a particular linguistic context. In natural language, a word often has more than one meaning, a phenomenon known as lexical ambiguity. Word sense disambiguation is the task of having the computer select the correct meaning of a word in context. It is regarded as one of the most difficult word-level problems in natural language processing, and how well it is solved directly affects downstream applications. Because polysemy is so widespread, disambiguation is a central issue in many applications, including machine translation, information retrieval, semantic analysis, syntactic analysis, speech recognition, and text-to-speech conversion. Named entities are important language units that carry a great deal of contextual information, and named entity ambiguity has become a pressing problem.

This thesis surveys the mainstream named entity disambiguation methods, analyzes their characteristics and drawbacks, and then proposes a named entity disambiguation method based on Wikipedia. The main contents and contributions of the thesis are:

1. Introduces the research history and current state of named entity disambiguation, and discusses its concepts, classification, and basic measures.
2. Discusses and analyzes the basic principles of traditional named entity disambiguation methods, with a close study of the feature selection, disambiguation process, and advantages and disadvantages of the major methods.
3. Proposes a named entity disambiguation method based on Wikipedia. The method extracts several features and obtains the final disambiguation results using machine learning algorithms.
4. Conducts comparison experiments on given test data. The results show that the proposed method achieves higher accuracy than the traditional disambiguation methods.
Keywords/Search Tags: Named Entity, Disambiguation, Feature Selection, Wikipedia