An Approach For Word Sense Disambiguation Based On WordNet

Posted on: 2014-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: H X Wan
Full Text: PDF
GTID: 2248330395997465
Subject: Computer application technology
Abstract/Summary:
With the rise of the Internet, a large amount of knowledge expressed in natural language is now stored on media such as microblogs, web pages, forums, and post bars. Knowledge mining, natural language processing, and knowledge credibility research have therefore become popular research directions. Because ambiguity is pervasive in natural language, natural language processing is very difficult. As a basic research topic in natural language processing, word sense disambiguation is used in information retrieval (IR), information extraction (IE), machine translation (MT), content analysis (CP), and other fields. It is both a focus and a difficulty of natural language processing research, and it has important theoretical and practical significance for other language information processing tasks. Word sense disambiguation is an "intermediate task" that is often used in parsing, machine translation, text processing, speech recognition, and information retrieval systems, so its research results can be applied directly to many aspects of natural language processing.
A word sense disambiguation system needs a large amount of knowledge for inference, but a lack of knowledge leads to low accuracy and low coverage; this is known as the knowledge acquisition bottleneck. This problem has held back the performance of word sense disambiguation systems and limited their practical use. In addition, word sense disambiguation automatically determines the meaning of an ambiguous word from its context: the knowledge needed to determine the meaning of a polysemous word is contained in the sentence or passage in which it occurs, so context is important for word sense disambiguation.
Word sense disambiguation is the ability to identify the meaning of words in context computationally, and it is regarded as an AI-complete problem. It is difficult to realize: first, unstructured documents must be converted into structured data; then the word senses are determined according to the knowledge offered by a knowledge base or the disambiguation rules defined by the designers. In this paper, we construct sense representatives based on WordNet to obtain rich knowledge for word sense disambiguation and so overcome the knowledge acquisition bottleneck. In addition, the method uses WordNet as its only knowledge base and does not require any labeled training data, which makes it possible to apply the system to the scene search project.
Based on the above problems, this paper studies how to obtain a wealth of knowledge from the knowledge source and how to build an effective context. In the decades since word sense disambiguation was first proposed, many knowledge sources have appeared. For English disambiguation, WordNet is the most commonly used knowledge base. It is a computational dictionary based on linguistic rules, created at Princeton University, and it is chosen as the only knowledge base in this paper. With the knowledge base selected, the next step is to study how to obtain rich disambiguation knowledge from WordNet. We construct three sense representative models based on WordNet for this purpose, so as to overcome the knowledge acquisition bottleneck. Moreover, because word sense disambiguation automatically determines the meanings of ambiguous words from their context, the context has a direct impact on the performance of a word sense disambiguation system.
So far, there are three kinds of context extraction methods: window-based, dependency-based, and syntactic-analysis-based. These methods are introduced in detail later. This paper extracts context feature words based on the phrase structure tree (Ptree) combined with chunk analysis; experiments verify that this method effectively improves the performance of the word sense disambiguation system.
The main research work and results are as follows:
1. Drawing on the method of extracting context feature words based on the phrase structure tree (Ptree), we propose extracting context feature words based on the Ptree combined with chunk analysis. First, the sentence containing the ambiguous word is parsed to obtain its syntactic analysis tree. Second, head words are extracted as context feature words according to the head word rule table of chunk analysis. Finally, word sense disambiguation is carried out using this context.
2. We present multi-strategy word sense disambiguation: making full use of the semantic relations between synsets in WordNet, we define different disambiguation strategies according to the semantic relations of synsets with different parts of speech, and carry out word sense disambiguation using the context.
3. We construct three kinds of sense representative models based on WordNet to obtain rich knowledge for word sense disambiguation and overcome the knowledge acquisition bottleneck.
The linguist Firth commented on word sense disambiguation as follows: "You shall know a word by the company it keeps." That is, the meaning of an ambiguous word is determined by its context, and context is the only basis for word sense disambiguation. The context extraction method in this paper helps greatly to improve the performance of the word sense disambiguation system.
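Of the three context extraction methods named above, the window-based one is the simplest to illustrate. The following is a minimal sketch of that baseline only, not of the Ptree and chunk analysis method proposed in this paper. The stop-word list, the function names, and the window size are assumptions made for illustration.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "and", "is", "was"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def window_context(tokens, target_index, size=3):
    """Return up to `size` content words on each side of the target token."""
    left = [t for t in tokens[:target_index] if t not in STOP_WORDS][-size:]
    right = [t for t in tokens[target_index + 1:] if t not in STOP_WORDS][:size]
    return left + right


tokens = tokenize("He sat on the bank of the river and watched the water.")
context = window_context(tokens, tokens.index("bank"), size=2)
print(context)  # ['he', 'sat', 'river', 'watched']
```

A fixed window picks up whatever words happen to be nearby, such as "he" and "sat" here, which is one motivation for syntax-based extraction that keeps only the head words of the surrounding chunks.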
To address the knowledge acquisition bottleneck, this paper constructs three kinds of sense representative models that obtain rich knowledge for word sense disambiguation by mining the wealth of knowledge in WordNet, which effectively improves the accuracy and coverage of the word sense disambiguation system. The test set is the English all-words disambiguation task in Senseval-2003, and the test results were quite satisfactory.
A knowledge-based word sense disambiguation system mainly includes the following stages: (1) document preprocessing, in which the context is expressed as structured data that a computer can process; (2) mining knowledge from the knowledge base; (3) semantic selection based on the context and the disambiguation knowledge.
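The three stages above can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy sense inventory stands in for WordNet, the sense labels and stop-word list are hypothetical, and the gloss-overlap scoring is a simplified Lesk-style stand-in rather than the sense representative models or multi-strategy selection of this paper.

```python
import re

# Hypothetical toy sense inventory standing in for WordNet glosses.
TOY_SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits and lends money",
        "bank.river": "sloping land beside a body of water such as a river",
    },
}

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "and",
              "that", "such", "as", "she", "they"}


def preprocess(text):
    """Stage 1: turn raw text into a structured set of content words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def mine_knowledge(word):
    """Stage 2: collect a bag of words for each sense from the knowledge base."""
    return {sense: preprocess(gloss) for sense, gloss in TOY_SENSES[word].items()}


def select_sense(word, sentence):
    """Stage 3: pick the sense whose bag of words overlaps the context most."""
    context = preprocess(sentence) - {word}
    knowledge = mine_knowledge(word)
    return max(knowledge, key=lambda sense: len(knowledge[sense] & context))


print(select_sense("bank", "She deposits money at the bank"))
print(select_sense("bank", "They walked along the bank of the river"))
```

With such short glosses many contexts share no words with any sense, which is the low-coverage side of the knowledge acquisition bottleneck; enriching each sense's bag of words from related entries in the knowledge base is the general idea behind a sense representative.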
Keywords/Search Tags:Natural Language Processing, word sense disambiguation, Context, Syntactic analysis, WordNet